I'm having trouble with the chisq.test function in R: I get different, odd results depending on how I pass in the data.
Let's say I have the following table named t:
> t
data1 data2 data3 data4 data5
1487 3301 2983 2432 6151
1296 1519 1354 1244 3139
1169 867 837 916 2191
1372 681 802 1065 1749
1497 630 962 1256 1304
1502 544 1097 1380 942
1344 477 1200 1410 673
1031 346 1199 1286 347
705 172 975 980 170
542 90 919 770 66
276 26 1005 604 10
I'm running chi-squared tests between pairs of columns, but I don't understand the results.
When I run chisq.test(x=t[,1], y=t[,2]), I get:
X-squared = 110, df = 100, p-value = 0.2322
which is the same result as when I run:
data1 <- c(1487, 1296, 1169, 1372, 1497, 1502, 1344, 1031, 705, 542, 276)
data2 <- c(3301, 1519, 867, 681, 630, 544, 477, 346, 172, 90, 26)
chisq.test(x=data1, y=data2)
But it is different from:
t2 <- matrix(c(data1, data2), ncol=11, nrow=2, byrow=TRUE)
chisq.test(t2)
X-squared = 2865.8, df = 10, p-value < 2.2e-16
Judging by the degrees of freedom, I guess the last one is correct, but what is happening here? Moreover, I get the same p-value whichever columns I use in the test.
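
For reference, here is a minimal sketch of what appears to be going on, based on the documented behavior of chisq.test: when x and y are both vectors, they are coerced to factors and cross-tabulated, so the numbers are treated as category labels rather than counts. The names tab, res_vec, res_tab and res_counts are introduced here only for illustration.

```r
data1 <- c(1487, 1296, 1169, 1372, 1497, 1502, 1344, 1031, 705, 542, 276)
data2 <- c(3301, 1519, 867, 681, 630, 544, 477, 346, 172, 90, 26)

# Two-vector form: each vector becomes a factor and the pair is cross-tabulated.
# Every value is distinct, so this is an 11 x 11 table with a single 1 per row.
tab <- table(data1, data2)
dim(tab)

# Both calls test that 11 x 11 table, hence df = (11 - 1) * (11 - 1) = 100.
# suppressWarnings() hides the "approximation may be incorrect" warning.
res_vec <- suppressWarnings(chisq.test(x = data1, y = data2))
res_tab <- suppressWarnings(chisq.test(tab))

# Matrix form: the values are used as counts in an 11 x 2 table,
# hence df = (11 - 1) * (2 - 1) = 10.
res_counts <- chisq.test(cbind(data1, data2))

res_vec$parameter     # degrees of freedom for the two-vector form
res_counts$parameter  # degrees of freedom for the count-matrix form
```

If this reading is right, any two columns whose values are all distinct produce the same 11 x 11 pattern of ones and zeros, which would explain why the p-value does not change with the columns chosen.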