How to perform a t-test on multiple columns of a data frame

Question

I have a data frame with 7 columns, each corresponding to a chromosome. I would like to test whether the values for each chromosome differ significantly from those of the other chromosomes. Here is a subset of the data frame:

       A01      A02       A03       A04       A05       A06       A07
1 0.0475424 0.224646 0.1065940 0.1580800 0.0279520 0.8189890 0.2721350
2 0.0383661 0.133959 0.0579846 0.0300916 0.1662380 0.0735981 0.2863390
3 0.2999830 0.407670 0.1696190 0.0379608 0.0544481 0.1532610 0.1041220
4 0.1605930 0.729948 0.0642579 0.4513340 0.3155020 0.3234300 0.7930150
5 0.5301730 0.100597 0.1850310 0.1111630 0.1000220 0.2172030 0.0748173
6 0.0268711 1.278470 0.0958172 0.5504090 0.3600080 0.0355549 0.3678820

I know I can use a t-test to compare "A01" to "A02" and so on, but that only tells me whether those two chromosomes differ significantly. My plan is to compare A01 with all the other chromosomes. How can I do that?

Thanks

Upendra

Aside from any technicalities, this seems a little dubious from a statistical/data fishing point of view. — thelatemail, May 11 '14 at 23:54

score 1 · Answer 1 · edited May 23 '17 at 12:33

You could reference this post, which was the first link when Googling "t test multiple columns R".

Using the reshape2 package and a pairwise.t.test, and assuming dat is your data...

> library(reshape2)
> meltdf <- melt(dat)
> pairwise.t.test(meltdf$value, meltdf$variable, p.adjust.method = "none")
#  Pairwise comparisons using t tests with pooled SD 
# 
# data:  meltdf$value and meltdf$variable 
# 
#     A01   A02   A03   A04   A05   A06  
# A02 0.056 -     -     -     -     -    
# A03 0.639 0.019 -     -     -     -    
# A04 0.794 0.095 0.466 -     -     -    
# A05 0.930 0.046 0.703 0.727 -     -    
# A06 0.566 0.171 0.300 0.754 0.509 -    
# A07 0.381 0.283 0.182 0.536 0.336 0.760
# 
# P value adjustment method: none
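
For reference, the same comparison can be sketched in base R without reshape2, this time with a multiple-testing correction. This is only a sketch: `dat` is rebuilt here from the six rows shown in the question, and the choice of Holm's method is an assumption, not something the answer above used.

```r
# Data copied from the six rows shown in the question.
dat <- data.frame(
  A01 = c(0.0475424, 0.0383661, 0.2999830, 0.1605930, 0.5301730, 0.0268711),
  A02 = c(0.224646, 0.133959, 0.407670, 0.729948, 0.100597, 1.278470),
  A03 = c(0.1065940, 0.0579846, 0.1696190, 0.0642579, 0.1850310, 0.0958172),
  A04 = c(0.1580800, 0.0300916, 0.0379608, 0.4513340, 0.1111630, 0.5504090),
  A05 = c(0.0279520, 0.1662380, 0.0544481, 0.3155020, 0.1000220, 0.3600080),
  A06 = c(0.8189890, 0.0735981, 0.1532610, 0.3234300, 0.2172030, 0.0355549),
  A07 = c(0.2721350, 0.2863390, 0.1041220, 0.7930150, 0.0748173, 0.3678820)
)

# stack() is the base R counterpart to reshape2::melt() for this layout:
# it returns a long data frame with columns `values` and `ind`.
long <- stack(dat)

# All pairwise t-tests with pooled SD; p-values adjusted with Holm's method
# (an assumed choice, any method accepted by p.adjust() could be used).
res <- pairwise.t.test(long$values, long$ind, p.adjust.method = "holm")
print(res)
```

The matrix of adjusted p-values is available as `res$p.value` if it needs to be processed further.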

answered May 12 '14 at 00:11

Rich Scriven

I saw the linked post but still posted the question, since I am not interested in pairwise comparisons but would like to compare each chromosome with the remaining 6 chromosomes. Can this be done? – upendra May 12 '14 at 00:47
You mean like `A01 ~ all the others`, then `A02 ~ all the others`, etc? – Rich Scriven May 12 '14 at 00:51
That's right @Richard. `A01 vs all (except A01)`, then `A02 vs all (except A02)` and so on..... – upendra May 12 '14 at 01:04
Are you sure you want a t-test then? t-tests just compare means against a null hypothesis (for a one-sample test, that the mean equals zero; for a two-sample test, that the two means are equal). You may want analysis of variance. – Rich Scriven May 12 '14 at 01:09
You mean I can do `aov(lm(dat$A01 ~ dat$notA01))`, where `notA01` is the sum of all the columns in `dat` except `A01`? – upendra May 12 '14 at 01:29
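
A one-way analysis of variance, as suggested in the comments above, does not need a `notA01` column. A minimal sketch, assuming `dat` holds the six rows from the question and that the usual ANOVA assumptions are acceptable: reshape to long format and model the value by chromosome. Note that this tests whether any chromosome mean differs from the others, not which one.

```r
# Data copied from the six rows shown in the question.
dat <- data.frame(
  A01 = c(0.0475424, 0.0383661, 0.2999830, 0.1605930, 0.5301730, 0.0268711),
  A02 = c(0.224646, 0.133959, 0.407670, 0.729948, 0.100597, 1.278470),
  A03 = c(0.1065940, 0.0579846, 0.1696190, 0.0642579, 0.1850310, 0.0958172),
  A04 = c(0.1580800, 0.0300916, 0.0379608, 0.4513340, 0.1111630, 0.5504090),
  A05 = c(0.0279520, 0.1662380, 0.0544481, 0.3155020, 0.1000220, 0.3600080),
  A06 = c(0.8189890, 0.0735981, 0.1532610, 0.3234300, 0.2172030, 0.0355549),
  A07 = c(0.2721350, 0.2863390, 0.1041220, 0.7930150, 0.0748173, 0.3678820)
)

# Long format: `values` holds the measurements, `ind` the chromosome label.
long <- stack(dat)

# One-way ANOVA: does the mean value differ between chromosomes?
fit <- aov(values ~ ind, data = long)
print(summary(fit))

# Overall p-value for the chromosome factor.
p_overall <- summary(fit)[[1]][["Pr(>F)"]][1]
```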

score 0 · Answer 2 · answered May 12 '14 at 14:47

Welch's t-test can be used to compare the means of normally distributed vectors with different lengths and variances. It is the test used by default by R's t.test() function.

Therefore, I guess you could do what you want by pooling all the vectors from A02 to the end and comparing them with A01 using t.test (if they are normally distributed, which you can check with ks.test or shapiro.test).

Then vary the tested vector in a loop, and that should do the trick.

However, you will have to fix the multiple testing issue manually (which is not so hard).
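
The approach described above could be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: `dat` is rebuilt from the six rows in the question, `one_vs_rest` is a hypothetical helper name, and Holm's method is just one possible correction. Keep in mind that the seven tests share data, and that with six values per column a normality check has little power.

```r
# Data copied from the six rows shown in the question.
dat <- data.frame(
  A01 = c(0.0475424, 0.0383661, 0.2999830, 0.1605930, 0.5301730, 0.0268711),
  A02 = c(0.224646, 0.133959, 0.407670, 0.729948, 0.100597, 1.278470),
  A03 = c(0.1065940, 0.0579846, 0.1696190, 0.0642579, 0.1850310, 0.0958172),
  A04 = c(0.1580800, 0.0300916, 0.0379608, 0.4513340, 0.1111630, 0.5504090),
  A05 = c(0.0279520, 0.1662380, 0.0544481, 0.3155020, 0.1000220, 0.3600080),
  A06 = c(0.8189890, 0.0735981, 0.1532610, 0.3234300, 0.2172030, 0.0355549),
  A07 = c(0.2721350, 0.2863390, 0.1041220, 0.7930150, 0.0748173, 0.3678820)
)

# Optional normality check per column (Shapiro-Wilk p-values).
normality_p <- sapply(dat, function(x) shapiro.test(x)$p.value)

# Hypothetical helper: Welch t-test of each column against all other
# columns pooled together, followed by a multiple-testing correction.
one_vs_rest <- function(df, method = "holm") {
  p_raw <- sapply(names(df), function(col) {
    this_col <- df[[col]]
    the_rest <- unlist(df[setdiff(names(df), col)], use.names = FALSE)
    t.test(this_col, the_rest)$p.value  # Welch test is the default
  })
  data.frame(
    column = names(df),
    p_raw  = unname(p_raw),
    p_adj  = unname(p.adjust(p_raw, method = method)),
    row.names = NULL
  )
}

res <- one_vs_rest(dat)
print(res)
```

Each row of `res` corresponds to one "column vs. all the others" comparison; `p_adj` is the value to read once multiple testing is taken into account.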
