How accurate are the d20 dice used in role playing games?

Question

The website Thinkgeek sells "High precision dice", citing this video as the explanation.

No, you're not being superstitious when you stand at the counter at your Friendly Local Gaming Store test-rolling all the d20s. There ARE in fact "unlucky" dice that always roll poorly. When you have a spare 20 minutes, check out the video on YouTube of Lou Zocchi of GameScience explaining why these High Precision Gaming Dice are the most accurate ones you'll find anywhere. It's fascinating stuff and you'll never see dice the same way again.

Is there any truth to the claims made by this man and Thinkgeek, or is it just a marketing gimmick?

Related question: Is a coin toss fair?

If there are some dice that always roll poorly, then one can assume that there are some dice that always roll well. Wouldn't it make more sense to get a die that always rolls high as opposed to a "fair" die? — Kibbee, Nov 25 '11 at 15:38
@Kibbee, Zocchi's main claims seemed to be that the dice are (a) egg-shaped and (b) too round, which enhances the effect of the egg-shape. The egg-shape means certain *opposite* pairs are more likely. Dice are traditionally made with opposite numbers adding up to the same total (number of sides+1). Therefore, rather than a "tends to roll high", you might get a "tends to roll 1 OR 20" or "tends to roll middling numbers". Zocchi claims such middling dice, in the hands of the Game Master, would favour early player-character death, but doesn't justify why that would be the outcome. — Oddthinking, Nov 25 '11 at 16:26
@DVK (+1): ...but what does this have to do with Network Emergency Response Dudes? ;-P — Randolf Richardson, Nov 26 '11 at 00:25
I don't doubt some manufacturers have poor QA standards and ship dice that are improperly balanced when others have higher standards and would remove such from the production line. Whether this specific manufacturer is one of the latter I've no way of knowing. — jwenting, Nov 28 '11 at 06:59
@Oddthinking: a DM who rolls more 1's and 20's than they statistically should would cause more critical failures and critical successes than would normally happen in an average campaign. — WWW, Nov 28 '11 at 19:53
@Oddthinking - Not all dice follow those rules either. I have some 20 sided dice that do not. The same is true of some of the 8 sided dice I have. — Chad, Nov 28 '11 at 20:25
@crontab: Would they cancel out their effect on player longevity? Wouldn't the adversaries also suffer? — Oddthinking, Nov 28 '11 at 22:40
@Chad: :-( Have modern manufacturers no respect for tradition? Did they do no research or apprenticeships before making their inferior rubbish? Okay, I am being facetious, but I have to admit I find such an idea confronting! I wouldn't buy them! — Oddthinking, Nov 28 '11 at 22:43
@Oddthinking: if the DM is rolling for something that if the roll is critical could cause instant death and the DM has a tendency to roll 20's, it would be bad. (sorry about the horribly long sentence) A DM who causes their players to die more often than average wouldn't be very popular considering the time spent investing in a player character. Non-player characters could also suffer, but there usually isn't nearly as much time developing those characters. — WWW, Nov 28 '11 at 23:53

score 16 · Accepted Answer · edited Jun 17 '20 at 09:41

Whatever Lou Zocchi claims, there's only one way to be sure: science! That is, do the experiment and check whether his dice really roll more fairly than his competitors'.

How do you do the experiment? Delta's D&D HotSpot is a blog written by a math teacher, and he wrote an article about how to apply Pearson's chi-squared hypothesis testing to this problem.
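
As a rough illustration of the kind of test that article describes, here is a minimal Python sketch of Pearson's chi-squared goodness-of-fit test for a d20. It is not the blog author's code: the function names `chi2_sf` and `d20_chi_square` and the sample data are made up for this example, and the p-value comes from a hand-rolled series for the incomplete gamma function so that only the standard library is needed. It also assumes that "SSE" means the sum of squared differences between observed and expected face counts, so that dividing it by the expected count gives the chi-squared statistic.

```python
import math
from collections import Counter


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a = df / 2.0
    z = x / 2.0
    # Series expansion of the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def d20_chi_square(rolls, sides=20):
    """Return (SSE, chi-squared statistic, p-value) for a list of rolls of one die."""
    counts = Counter(rolls)
    expected = len(rolls) / sides  # a fair die shows every face equally often
    sse = sum((counts.get(face, 0) - expected) ** 2 for face in range(1, sides + 1))
    chi2 = sse / expected
    return sse, chi2, chi2_sf(chi2, sides - 1)


# Hypothetical 100-roll sample: faces 1-10 seen 7 times each, faces 11-20 seen 3 times each.
sample = [face for face in range(1, 11) for _ in range(7)] + \
         [face for face in range(11, 21) for _ in range(3)]
sse, chi2, p = d20_chi_square(sample)
print(f"SSE={sse:.0f}, chi2={chi2:.1f}, p={p:.2f}")
```

With 100 rolls the expected count is 5 per face, so an SSE of 80 corresponds to a statistic of 16 on 19 degrees of freedom and a p-value of about 0.66, while the 5% rejection threshold (a statistic of about 30.14) corresponds to an SSE of about 150. The 100-roll sample size is an assumption made here because it is consistent with the figures quoted in this thread; it is not stated in this post.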

He then followed it up with an informal experiment, where he applied the testing method to a number of d20s (20-sided, i.e. icosahedral, dice) he owned. Coincidentally, he owned an old d20 which he believes is one of Lou Zocchi's. Sure enough, it gave the lowest error figure, informally supporting Zocchi's claims.

Now at the end, I tested what I presumed would be the weakest die in my collection: an older translucent red d20, with sharp edges, that I had to color in myself with a crayon. The other dice in this set still show the tab from where it was snapped off the molding sprue (although I can't see it on the d20 itself; these dice are probably from Gamescience). Well, unexpectedly to me, this d20 had the lowest error of the bunch: SSE = 80, ~~significantly~~ lower than anything else I had in the house, ~~and clearly the fairest-rolling die of anything I tested (P-value = 0.66).~~

So my theory now would be that a die that has sharp edges is more likely to roll fairly than one that has rounded edges, even though I've been avoiding this "sharp-edged" set for years now because to my eye it looked less professional.

Strike-outs added by me, where the author overstepped what could be safely concluded. See below.

Limitations

As explained in the comments by @Konrad Rudolph, these results do not justify ranking the dice by their SSE. The very large p-value the author calculated is also consistent with his ranking statement not being reliable.
All we can conclude is that none of the dice behaved inconsistently with being completely balanced. That's a lot of double negatives: all the dice appeared fine given the limited results available.
The author didn't test for long enough to conclude that any of the dice were actually balanced. In a follow-up, he calculates that a much longer test would be required.
He didn't test a wide range of brands to confirm that all of Zocchi's competitors suffer the same problem.
He didn't test a large sample of dice within each brand to confirm that the quality of the dice was consistent within the batch.
He didn't test each die over a range of ages to confirm that dice don't change in quality over time.
It wasn't peer-reviewed and I haven't seen it reproduced.
There is a small but significant risk of Type I errors (i.e. true dice being classified as untrue), which is one of the reasons to want to see it reproduced.
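
To get a feel for the Type I error risk, and for why a short test cannot show that a die is balanced, here is a small Monte Carlo sketch in Python. Everything specific in it is hypothetical: the function name `rejection_rate`, the choice of bias (face 20 coming up 7.5% of the time instead of 5%), the roll counts, the number of trials, and the seed. It rejects "fair" whenever the chi-squared statistic exceeds 30.144, the standard 5% critical value for 19 degrees of freedom.

```python
import random
from collections import Counter

CRITICAL_19_DF_5PCT = 30.144  # 5% critical value of chi-squared with 19 degrees of freedom


def rejection_rate(weights, n_rolls, trials=400, seed=0):
    """Fraction of simulated experiments in which the chi-squared test rejects 'fair'."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    faces = range(1, 21)
    expected = n_rolls / 20
    rejections = 0
    for _ in range(trials):
        counts = Counter(rng.choices(faces, weights=weights, k=n_rolls))
        chi2 = sum((counts[f] - expected) ** 2 for f in faces) / expected
        if chi2 > CRITICAL_19_DF_5PCT:
            rejections += 1
    return rejections / trials


fair = [1.0] * 20
# Hypothetical mild bias: face 20 comes up 7.5% of the time, the rest share the remainder.
biased = [0.925 / 19] * 19 + [0.075]

fair_100 = rejection_rate(fair, 100)       # Type I errors: fair die flagged as biased
biased_100 = rejection_rate(biased, 100)   # how often a short test catches the bias
biased_2000 = rejection_rate(biased, 2000) # how often a much longer test catches it
print(fair_100, biased_100, biased_2000)
```

Under these assumptions one would expect the fair die to be flagged in roughly 5% of experiments, the mildly biased die to be flagged only slightly more often at 100 rolls, and the same die to be caught most of the time at 2000 rolls. That is the sense in which passing a short test says little about whether a die is actually balanced.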

So the result is better than anecdotal-with-confirmation-bias, but still very limited in its power. I'd like to see someone find a more comprehensive answer.

answered Nov 25 '11 at 16:19 by Oddthinking · edited Jun 17 '20 at 09:41 by Community

A p-value of 0.66 is *anything but* significant. Is this a typo? Furthermore, the guy seems to interpret p-values *completely* wrong – not just slightly, *completely*. In fact, he seems to find no evidence either way, *not*, as he claims, evidence that dice are not biased. He chose the completely wrong null hypothesis to test against. That said, the SSEs cited by him *seem* to yield significantly different results so applying the significance test the right way round *would* probably yield quite low p-values (I haven’t done the calculation, just eyeballing). – Konrad Rudolph Nov 26 '11 at 20:54
@Konrad, please elaborate some more. My understanding: He was really testing each die to see if there was evidence it was biased. His line at the end, which I quoted, was a different question: whether the Gamescience die was the least biased of the set, for which he provided little evidence (poor p-values). The [author's follow-up](http://deltasdnd.blogspot.com/2011/10/testing-balanced-dice-power.html) seems to suggest that much larger values of n would be required to prove a die was unbiased, rather than to prove it was moderately biased. – Oddthinking Nov 26 '11 at 23:42
Ok, having read the previous post I see how he applies the chi-square test. Nevertheless, an SSE value < 150 merely tells us that we *cannot reject* the null hypothesis. It does not, strictly speaking, tell us that it’s true. What we *want*, in order to draw a statistically founded conclusion, is to find a null hypothesis which we can ultimately reject with high confidence. He does the opposite. Of course this *is* enough to show that the SSEs are well within an acceptable margin explained by sampling variations. But his formulations are misleading, and in the last case outright wrong. – Konrad Rudolph Nov 27 '11 at 11:40
To make this clear: the presented statistic does not give us any way of judging whether the sharp-edged die performs better than the blunt-edged one. Of course, just looking at the SSEs makes this very plausible; but taking the p-values and concluding anything from their relative magnitude is a common error. All those values tell us is whether to reject H0. They do *not* give us a ranking. Or anything else. – Konrad Rudolph Nov 27 '11 at 11:44
Is his evaluation that p=0.66 (for the statement he made) right? (I am trying to get the story very clear, so I can edit the answer to make the distinction understandable to a casual reader.) The difficulty I always have with this area is that you can't decide whether this particular die is likely to be biased until you know the distribution of loaded dice. It takes far more rolls to convince me a die at a well-regulated casino is dodgy compared to one being used by a magician in a show. – Oddthinking Nov 27 '11 at 12:22
@KonradRudolph: I have made some changes. Would appreciate a review. – Oddthinking Nov 27 '11 at 12:36
Fine as far as I’m concerned. You certainly elaborated the conclusions that *can* be drawn rather well, and showed the limitations of the tests. Also, your remark about casino vs. dodgy magician brings us back to Bayes and the proper choice of priors. Well, in the general case we don’t have any prior information so assuming the default as done by Pearson’s chi-square test is the best we can do. – Konrad Rudolph Nov 28 '11 at 09:30
Just saw this question for the first time. I did my own 500 roll chi-square test of a Chessex Orange, Green, and Purple last year if anyone would find the results interesting: http://blog.codeoptimism.com/most-d20-dice-are-notably-imbalanced/ Spoiler: pretty bad – Christopher Galpin Apr 08 '14 at 22:16
