I just noticed that for 2 x 2 tables with low cell frequencies, R seems to compute the chi^2 statistic incorrectly when the Yates correction is applied (the default).
mat <- matrix(c(3, 2, 14, 10), ncol = 2)
chi <- stats::chisq.test(mat)
## Warning message:
## In stats::chisq.test(mat) : Chi-squared approximation may be incorrect
# from the function
chi$statistic
## X-squared
## 1.626059e-31
# computed by hand, subtracting 0.5 from every |O - E| (Yates correction)
sum((abs(chi$observed - chi$expected) - 0.5)^2 / chi$expected)
## [1] 0.1851001
Am I right in thinking that R is computing it incorrectly, and that the second method, which yields 0.185, is more accurate? Or do the small cell counts mean that all bets are off?
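On the small-counts part of the question: as far as I can tell, the warning is raised when any expected count is below 5. Here is a minimal sketch that checks this for the table above and, as one possible alternative, runs Fisher's exact test, which does not rely on the chi-squared approximation. The names `expected`, `low_cells`, and `exact` are mine, and whether the exact test is the right tool for this data is an assumption, not a recommendation.

```r
mat <- matrix(c(3, 2, 14, 10), ncol = 2)

# Expected counts under independence; suppressWarnings() only hides the
# approximation warning that chisq.test() raises for this table
expected <- suppressWarnings(stats::chisq.test(mat))$expected

# TRUE for cells whose expected count falls below the usual threshold of 5
low_cells <- expected < 5
print(low_cells)

# Fisher's exact test works from the hypergeometric distribution directly
exact <- stats::fisher.test(mat)
print(exact$p.value)
```

If `low_cells` contains any `TRUE`, the chi-squared p-value is an approximation of doubtful quality regardless of how the statistic itself is computed.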
Update:
Without the Yates continuity correction, the function and the manual calculation agree:
chi <- stats::chisq.test(mat, correct = FALSE)
## Warning message:
## In stats::chisq.test(mat, correct = FALSE) :
## Chi-squared approximation may be incorrect
chi$statistic
## X-squared
## 0.004738562
# computed by hand, without the correction
sum((abs(chi$observed - chi$expected))^2 / chi$expected)
## [1] 0.004738562
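One possible explanation for the discrepancy: the help page for `chisq.test` says of `correct` that one half is subtracted from all |O - E| differences, but that "the correction will not be bigger than the differences themselves". The sketch below assumes this means the correction is `min(0.5, |O - E|)`. The variable names are mine, and this is a reconstruction for illustration rather than a copy of the function's internals.

```r
mat <- matrix(c(3, 2, 14, 10), ncol = 2)

# Expected counts under independence: (row total * column total) / n
expected <- outer(rowSums(mat), colSums(mat)) / sum(mat)

# In a 2 x 2 table every cell has the same |O - E|; here it is 2/29
abs_diff <- abs(mat - expected)
print(abs_diff)

# Assumed reading of the docs: the correction is capped at the smallest |O - E|
yates <- min(0.5, abs_diff)

# Capped correction: the differences shrink to (numerically) zero
capped_stat <- sum((abs_diff - yates)^2 / expected)

# Uncapped correction: subtracting a full 0.5 overshoots zero, and squaring
# the negative remainder produces a spuriously large statistic
uncapped_stat <- sum((abs_diff - 0.5)^2 / expected)

print(c(capped = capped_stat, uncapped = uncapped_stat))
```

Under that assumption, the capped value is zero up to floating-point error, which would account for the 1.626059e-31 reported by the function, while the uncapped value reproduces the 0.1851001 from my manual calculation. Since |O - E| is only about 0.069 here, subtracting a full 0.5 pushes the difference past zero, so the manual figure may be the one that is off.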