Does R incorrectly compute the chi-squared statistic for 2x2 tables with low cell counts?

Question

I just noticed that for 2 x 2 tables where cells have low frequencies, even with the Yates correction, R seems to be computing chi^2 statistics incorrectly.

mat <- matrix(c(3, 2, 14, 10), ncol = 2)
chi <- stats::chisq.test(mat)
## Warning message:
## In stats::chisq.test(mat) : Chi-squared approximation may be incorrect

# from the function
chi$statistic
##    X-squared 
## 1.626059e-31 

# as it should be (with Yates correction)
sum((abs(chi$observed - chi$expected) - 0.5)^2 / chi$expected)
## [1] 0.1851001

Am I right in thinking that R is computing it incorrectly, and that the second method yielding .185 is more accurate? Or do the small cell counts mean that all bets are off?

Update:

It does seem to work fine without the Yates continuity correction:

chi <- stats::chisq.test(mat, correct = FALSE)
## Warning message:
## In stats::chisq.test(mat, correct = FALSE) :
##   Chi-squared approximation may be incorrect

chi$statistic
##   X-squared 
## 0.004738562 

sum((abs(chi$observed - chi$expected))^2 / chi$expected)
## [1] 0.004738562

Look into [Fisher's Exact Test](https://www.google.com/#q=fisher+exact+test+vs+chi+square) over chi-squared for smaller cell counts. — Parfait, Jan 22 '17 at 02:44

Bernhard · Accepted Answer · 2017-01-22T00:03:00.820

The help page for chisq.test states:

one half is subtracted from all |O - E| differences; however,
the correction will not be bigger than the differences themselves.

The differences in your example are all smaller than 0.5:

> chi$observed - chi$expected
            [,1]        [,2]
[1,]  0.06896552 -0.06896552
[2,] -0.06896552  0.06896552

So, at least, it seems to be documented behaviour.
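To make the documented rule concrete, here is a minimal sketch that recomputes the statistic by hand with the correction capped at the size of the differences. The names `expected`, `diffs`, `yates` and `stat` are my own for illustration. In a 2 x 2 table all four |O - E| values are equal, so capping the correction at the smallest difference is the same as capping it at each one.

```r
mat <- matrix(c(3, 2, 14, 10), ncol = 2)

# Expected counts under independence: (row total * column total) / grand total
expected <- outer(rowSums(mat), colSums(mat)) / sum(mat)

# Absolute differences |O - E|; all four are about 0.069 here
diffs <- abs(mat - expected)

# Continuity correction: one half, but never larger than the differences
yates <- min(0.5, diffs)

# Corrected statistic: the correction cancels the differences entirely,
# so the result is zero up to floating-point error
stat <- sum((diffs - yates)^2 / expected)
stat
```

Subtracting a full 0.5 instead, as in the question, makes `abs(...) - 0.5` negative, and squaring that negative number is what produces the larger value of 0.185.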

Side note: If in doubt, you could use p-values found by simulation:

> chi <- stats::chisq.test(mat, simulate.p.value=TRUE, B=1e6)
> chi

    Pearson's Chi-squared test with simulated p-value (based on 1e+06 replicates)

data:  mat
X-squared = 0.0047386, df = NA, p-value = 1

Which, in this case, reports a chi-squared statistic between the two values from your question and gets rid of the warning. Or use fisher.test...
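For completeness, a minimal sketch of the fisher.test route on the same table. It conditions on the table margins and does not rely on the chi-squared approximation, so small expected counts are not a problem. The variable name `ft` is just for illustration.

```r
mat <- matrix(c(3, 2, 14, 10), ncol = 2)

# Fisher's exact test for a 2 x 2 table
ft <- stats::fisher.test(mat)

ft$p.value   # exact two-sided p-value
ft$estimate  # conditional maximum likelihood estimate of the odds ratio
```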

FWIW this is the chi-squared statistic without correction. — Ben Bolker, Jan 22 '17 at 04:06
