iAmCaffeine wrote:Except we know that the same sample of dice is being re-used over and over.
Incorrect. We don't "know" that. It's a theory that's been passed around, and I believe the theory is incorrect.
degaston wrote:Dukasaur wrote:... The "haven't been replenished" allegation is pure nonsense. The skew is pretty definitely proven, but it's far more likely to result from a rounding error than from the alleged non-replenishment.
How can you call it nonsense? A mod has already admitted that the "haven't been replenished" allegation is a possibility that would account for the results we're seeing:
Metsfanmax wrote:We were told that the list is replaced every hour. So one possible failure mode is if, for some reason, the list is no longer updating and got stuck on a list that was particularly non-uniform. But if the lists are still updating once per hour, then your explanation would require the sum total of those lists to be non-uniform.
His second sentence essentially confirms that any other explanation is extremely unlikely. A rounding error does not make any sense because there is nothing to round. The numbers they get from random.org are integers (1-6), and I've already tested numbers directly from random.org and found no bias. Also, the skew is not just to the right: it skews away from 1's, and towards 2's and 4's. That could not be the result of a simple rounding error.
That is not true.
Let's take your points one at a time.
First, the idea that Mets has some unique knowledge of the process. He's a mod, and a Team Leader, and so am I. We're not given the keys to the inner office. The workings of the core engine are pretty much strictly webmaster territory. Now, Mets works on the technical side of it and knows the stuff a little better than I do, but he's still guessing. And granted, he's a graduate student in astrophysics and a pretty smart cookie, but still, he's guessing, just as you and I are. And anyway, we're dealing with pretty basic high school algebra here, so even a rube like me can probably cope.
Second, the idea that random.org provides integers. Incorrect. If you and I go to random.org and ask for 100 or even 1000 integers, then yes, we will get them. That's because those are small numbers and the cost of crunching the numbers for you is small enough that they can afford to take the hit. But when you buy numbers from random.org in commercial quantities, they come in byte form (See: http://www.random.org/files/) and you have to convert them to integers yourself.
That's pretty easy if you want binary numbers. If we were playing RPGs here and rolling a lot of D4s and D8s, that would be great. But CC plays with D6s, and bytes are not easily converted to D6. A byte gives you a discrete value from 0 to 255, ergo 256 discrete values, and 6 does not go into 256 evenly. If it did, it would be a ludicrously simple calculation to do bitwise modulus division on the byte and come up with nice D6s.
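To make the 256-versus-6 problem concrete, here is a minimal Python sketch that counts how many of the 256 byte values land on each die face under a plain modulus. The function name `faces_by_modulus` is made up for illustration, and this is only a guess at the kind of calculation involved, not a description of what CC's engine actually does.

```python
from collections import Counter

def faces_by_modulus(sides=6, byte_values=256):
    """Count how many byte values map to each die face via (byte % sides) + 1."""
    return Counter((b % sides) + 1 for b in range(byte_values))

counts = faces_by_modulus()
for face in sorted(counts):
    print(face, counts[face])
```

With 6 sides, some faces get 43 byte values and others get 42, because 256 = 6 x 42 + 4. With 4 or 8 sides every face gets the same share, which is why D4s and D8s would be painless.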
Alas, 6 = 3 x 2 and there is no easy way to divide 3 into any power of 2. Oh! the 3, that Buddha, that Janus! Such a godlike blessing in geometry and such a demonic curse in arithmetic. Dividing a power of 2 by 3 always leaves a remainder, which means a repeating decimal and a guaranteed source of error.
Furthermore, if you do want to divide integers by 3 and start getting into decimal fractions, you're talking a massive hit on the servers. Again, not a big deal when you're processing 100s or 1000s, but when you need to roll millions of virtual dice every day, it's probably enough to force a more powerful processor and boost the server cost from $1500/month to $2000/month. Now, to Exxon that might be pocket money, but if you're a small business with gross annual revenues in the $125,000 range, you have to pay the customary rake-offs to the government (i.e. taxes) and the banks (i.e. processing percentages on online payments, which are not cheap, let me tell you!), pay three staff members, and somehow still find the money to keep your servers running. An extra $500/month is a really big deal, so if I was lack/Jefe/Wham, I would avoid floating-point operations at all costs, and go with strict bitwise math based on the discrete values in the byte.
What are some of the tricks we could pull to squeeze a number divisible by 6 out of 256? Well, there's a whole pile of possibilities. I'm not going to bother analysing all of them, for two reasons. First, I really don't care all that much. Second, I'm only guessing, and there's a limit to how much time one should waste on guesses with no real facts.
You can take one byte, multiply it by 2, use one bit from a different byte to dither the succeeding value so that it contains odd numbers, and do MOD(6) on the result. Or you could do regular (non-modulus) division on the 0-255 range and come up with segments of either 42 or 43 values in length, assigning a value between 1 and 6 to each segment.
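Both tricks can be sketched in a few lines of Python. The function names `faces_by_dither` and `faces_by_segments` are invented here, and the exact formulas are one possible reading of the two ideas above, not anything confirmed about the real engine. Each function just counts how many equally likely inputs end up on each face.

```python
from collections import Counter

def faces_by_dither(sides=6):
    """Trick 1: double the byte, add one bit from another byte, then take the modulus."""
    counts = Counter()
    for byte in range(256):
        for bit in (0, 1):
            value = 2 * byte + bit          # ranges over 0..511, odd and even
            counts[value % sides + 1] += 1
    return counts

def faces_by_segments(sides=6):
    """Trick 2: cut 0..255 into contiguous segments with integer division."""
    return Counter(byte * sides // 256 + 1 for byte in range(256))

dither_counts = faces_by_dither()
segment_counts = faces_by_segments()
for face in range(1, 7):
    print(face, dither_counts[face], segment_counts[face])
```

Neither version comes out even: the dithered one spreads 512 inputs over 6 faces, and the segmented one gives some faces 43 byte values and others 42. Which faces get the extra value depends entirely on the formula chosen.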
Tantalizingly, if I ballpark the figures on your graphs, the relative difference between rolling a 1 and rolling a 2 is (16.775/16.425)-1, which works out to an error of about 1/47, pretty damn close to the 1/43 error that I would expect by the segmentation method. That's what originally led me to believe (many months ago, when you first posted the graph) that this was a rounding error, not a randomization error.
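For anyone who wants to check that arithmetic, here is a small Python sketch. The two percentages are the ballpark readings quoted above, not exact data, and the variable names are just labels chosen for this example. The segment gap is shown both ways, since one extra value is 1/42 measured against the shorter segment and 1/43 measured against the longer one.

```python
low_pct = 16.425    # ballpark lower frequency read off the graph, in percent
high_pct = 16.775   # ballpark higher frequency read off the graph, in percent

observed_gap = high_pct / low_pct - 1    # relative excess of the higher frequency

gap_vs_short = (43 - 42) / 42            # one extra value, relative to a 42-value segment
gap_vs_long = (43 - 42) / 43             # the same extra value, relative to a 43-value segment

print(f"observed gap:      about 1/{1 / observed_gap:.1f}")
print(f"segment gap (/42): about 1/{1 / gap_vs_short:.1f}")
print(f"segment gap (/43): about 1/{1 / gap_vs_long:.1f}")
```

All three land in the same neighbourhood of roughly 2%, which is the whole point of the comparison.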
Now that I've shown you a direction to look in, you'll probably be able to hunt around and find some formulas much better than the ones I've shown. I personally don't care. Why do I not care? First, because as you admit yourself, the bias is the same for everyone, so nobody loses or gains from this. There would only be a concern if the bias was somehow targeted against some subset of our members, which it isn't. Second, 95% of the dice bitch threads are made by people who complain about the streakiness of the dice, not the bias against ones, and streakiness is a perfectly normal feature of truly random numbers. People who think that just because they rolled twenty 5s in a row, they are now due to roll something other than a 5, are engaging in the gambler's fallacy.
You've really done a disservice to the people engaging in the dice bitching. By showing that there is an error in terms of skewness, you've reinforced their paranoid delusions about streakiness, even though by your own admission the one has nothing to do with the other.