Compute one sample t-test for each column of a data frame and summarize results in a table

Question

Here is some sample data on my problem:

mydf <- data.frame(A = rnorm(20, 1, 5),
                   B = rnorm(20, 2, 5),
                   C = rnorm(20, 3, 5),
                   D = rnorm(20, 4, 5),
                   E = rnorm(20, 5, 5))

Now I'd like to run a one-sample t-test on each column of the data.frame to test whether its mean differs significantly from zero, as in t.test(mydf$A), and then store each column's mean, t-value, and p-value in a new data.frame. The result should look something like this:

      A    B    C    D    E
mean  x    x    x    x    x
t     x    x    x    x    x
p     x    x    x    x    x

I can think of some tedious ways to do this, like looping through mydf, calculating the parameters, and then looping through the new data.frame and inserting the values.
But with packages like plyr at hand, shouldn't there be a more concise and elegant way to do this?

Any ideas are highly appreciated.

[This](http://stackoverflow.com/questions/13109652/r-output-without-1-how-to-nicely-format) also might help you if you are using `regress`. — Metrics, Jun 29 '13 at 20:56

Thomas · Accepted Answer · 2013-06-29T20:53:57.650

Try something like this and then extract the results you want from the resulting table:

results <- lapply(mydf, t.test)
resultsmatrix <- do.call(cbind, results)
resultsmatrix[c("statistic","estimate","p.value"),]

Gives you:

          A         B          C            D           E           
statistic 1.401338  2.762266   5.406704     3.409422    5.024222    
estimate  1.677863  2.936304   5.418812     4.231458    5.577681    
p.value   0.1772363 0.01240057 3.231568e-05 0.002941106 7.531614e-05
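
The matrix built with do.call(cbind, ...) above is a list-mode matrix, so its cells are list elements rather than plain numbers. If a numeric data.frame with the row labels from the question (mean, t, p) is wanted, a minimal sketch along these lines may help. The seed, the name summary_tab, and the use of vapply are assumptions made for illustration, not part of the original answer:

```r
# Sample data as in the question; the seed is an assumption added
# so the numbers are reproducible.
set.seed(42)
mydf <- data.frame(A = rnorm(20, 1, 5),
                   B = rnorm(20, 2, 5),
                   C = rnorm(20, 3, 5),
                   D = rnorm(20, 4, 5),
                   E = rnorm(20, 5, 5))

# Run a one-sample t-test per column and keep three numbers from each.
# FUN.VALUE is named, so the result is a numeric matrix with
# rows "mean", "t", "p" and one column per data.frame column.
summary_tab <- vapply(mydf, function(x) {
  tt <- t.test(x)
  c(mean = unname(tt$estimate),
    t    = unname(tt$statistic),
    p    = tt$p.value)
}, FUN.VALUE = c(mean = 0, t = 0, p = 0))

# Convert to a data.frame; row names are carried over from the matrix.
summary_tab <- as.data.frame(summary_tab)
print(summary_tab)
```

Individual values can then be read as ordinary numbers, for example summary_tab["p", "A"].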

agstudy · Answer 2 · 2013-06-29T21:16:34.163

A data.table solution:

library(data.table)
DT <- as.data.table(mydf)
DT[,lapply(.SD,function(x){
         y <- t.test(x)
         list(p = round(y$p.value,2),
              h = round(y$conf.int,2),
              mm = round(y$estimate,2))})]

           A          B         C         D         E
1:        0.2       0.42      0.01         0         0
2: -0.91,3.98 -1.15,2.62 1.19,6.15 2.82,6.33 2.68,6.46
3:       1.54       0.74      3.67      4.57      4.57
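
Since a data.table has no row names, one possible workaround is to carry the labels in an ordinary column. The sketch below assumes that approach. The column name stat, the object name res, the seed, and the choice of mean, t, and p as the reported values are illustrative assumptions, not part of the answer above:

```r
library(data.table)

# Smaller sample data with the same structure as in the question;
# the seed is an assumption added for reproducibility.
set.seed(1)
mydf <- data.frame(A = rnorm(20, 1, 5),
                   B = rnorm(20, 2, 5),
                   C = rnorm(20, 3, 5))
DT <- as.data.table(mydf)

# Each column yields a plain numeric vector of length 3,
# so the result has three rows and numeric columns.
res <- DT[, lapply(.SD, function(x) {
  tt <- t.test(x)
  unname(c(tt$estimate, tt$statistic, tt$p.value))
})]

# Prepend a label column that plays the role of row names.
res <- data.table(stat = c("mean", "t", "p"), res)
print(res)
```

A single row can then be selected by label, for example res[stat == "p"].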

edited Jun 29 '13 at 21:16

answered Jun 29 '13 at 20:37

agstudy

Might be nice to have row names. Also, I tried to format your code, but it just requires a carriage return to format correctly, so I didn't hit the 6 character minimum edit. – Thomas Jun 29 '13 at 20:56
@Thomas thanks. I was away. But there isn't rownames with data.table. – agstudy Jun 29 '13 at 21:18
Is there a conceptual advantage of `data.table` that justifies the additional code, compared to the solution from @Thomas ? – vincentqu Jun 30 '13 at 17:23
