How to interpret t.test in R

Question

I have a dataset with two variables (x1 and x2) from many firms which belong to different industry groups. I calculate the variable "test1" for about 500 firms. We are given the following code:

df$test1 <- df$x1 - df$x2

library(dplyr)  # provides %>% and group_by()
library(broom)
result.test <- df %>% 
  group_by(industry) %>% do(tidy(t.test(.$test1, alt="two.sided", mu=0)))

The results are grouped by "industry", but it's not clear to me how the t-test proceeds. Is the t-test performed on "test1" for each firm and the average result then presented per industry group, or is the average of "test1" determined for each industry group and the t-test then performed?

I'm a little unclear on your question. There is only one `test1` variable in the data set, so I don't know what you mean by "for each variable `test1`" ... ? — Ben Bolker, Aug 30 '21 at 16:09
There are 500 companies in my data set. The variable "test1" is calculated for each company. I updated my question — newbie090909, Aug 30 '21 at 16:12

score 1 · Accepted Answer · answered Aug 30 '21 at 16:11

The t-test is applied separately to the subset of rows belonging to each level of industry, so you get one test per industry. Here is an example with mtcars:

library(dplyr)  # provides %>%, group_by() and do()
library(broom)
result.test <-
  mtcars %>% 
  group_by(cyl) %>%
  do(tidy(t.test(.$drat, alt="two.sided", mu=0)))

# A tibble: 3 x 9
# Groups:   cyl [3]
    cyl estimate statistic  p.value parameter conf.low conf.high method            alternative
  <dbl>    <dbl>     <dbl>    <dbl>     <dbl>    <dbl>     <dbl> <chr>             <chr>      
1     4     4.07      36.9 5.03e-12        10     3.83      4.32 One Sample t-test two.sided  
2     6     3.59      19.9 1.04e- 6         6     3.15      4.03 One Sample t-test two.sided  
3     8     3.23      32.4 7.93e-14        13     3.01      3.44 One Sample t-test two.sided

Now I will filter for cyl == 4 only and run the same test without grouping:

mtcars %>% 
  filter(cyl == 4) %>% 
  do(tidy(t.test(.$drat, alt="two.sided", mu=0)))

  estimate statistic  p.value parameter conf.low conf.high method            alternative
     <dbl>     <dbl>    <dbl>     <dbl>    <dbl>     <dbl> <chr>             <chr>      
1     4.07      36.9 5.03e-12        10     3.83      4.32 One Sample t-test two.sided

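I get the same result as the cyl = 4 row above. So grouping and then calling t.test is equivalent to running a separate one-sample t-test on the rows of each level of the grouping variable. Nothing is averaged across tests: each group's mean is tested against mu = 0 using that group's own observations.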
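To make the per-group computation explicit, here is a minimal base R sketch that recomputes the statistic by hand for each cyl group and compares it with t.test. The helper manual_t and the data frame comparison are names made up for this illustration; the formula is the usual one-sample t statistic, (mean - mu) / (sd / sqrt(n)).

```r
# Base R only; mtcars ships with R's datasets package.
# Split drat into one numeric vector per cyl level.
drat_by_cyl <- split(mtcars$drat, mtcars$cyl)

# Hypothetical helper: one-sample t statistic against mu, computed by hand.
manual_t <- function(x, mu = 0) {
  (mean(x) - mu) / (sd(x) / sqrt(length(x)))
}

# Run t.test() and the hand calculation on each group separately.
comparison <- data.frame(
  cyl       = names(drat_by_cyl),
  n         = lengths(drat_by_cyl),
  t_test    = sapply(drat_by_cyl, function(x) unname(t.test(x, mu = 0)$statistic)),
  t_by_hand = sapply(drat_by_cyl, manual_t),
  row.names = NULL
)
print(comparison)
```

The two statistic columns should agree for every group, and n - 1 should match the parameter (degrees of freedom) column in the tidy output above. In the original question, the same logic would apply with industry in place of cyl and test1 in place of drat.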

score 0 · Answer 2 · answered Aug 30 '21 at 16:53

We may also use nest_by, which creates one row per group with that group's rows stored in a list column named data:

library(dplyr)
library(tidyr)
library(broom)
mtcars %>%
    nest_by(cyl) %>%
    transmute(out = list(tidy(t.test(data$drat, alt = 'two.sided', 
         mu = 0)))) %>% 
    ungroup %>% 
    unnest(out)

-output

# A tibble: 3 x 9
    cyl estimate statistic  p.value parameter conf.low conf.high method            alternative
  <dbl>    <dbl>     <dbl>    <dbl>     <dbl>    <dbl>     <dbl> <chr>             <chr>      
1     4     4.07      36.9 5.03e-12        10     3.83      4.32 One Sample t-test two.sided  
2     6     3.59      19.9 1.04e- 6         6     3.15      4.03 One Sample t-test two.sided  
3     8     3.23      32.4 7.93e-14        13     3.01      3.44 One Sample t-test two.sided
