I'd like to perform a t_test on each group within my data. I'd like to be able to plot the results in ggplot, with each group and t_test in separate facets of facet_grid.

Here's some example data

data <- structure(list(line = c("low", "low", "low", "mid", "mid", "mid", 
"high", "high", "high", "high", "high", "high", "low", "low", 
"low", "mid", "mid", "mid", "high", "high", "high", "high", "high", 
"high"), dose = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), levels = c("0", 
"5000", "1e+05"), class = "factor"), expression = c(1, 1.07876018361211, 
2.74797781407644, 1.06055856449722, 1.26985386969035, 2.24216268515872, 
1.20035861056899, 1.29539908920738, 3.76428705070618, 1.31537212284813, 
1.43373360739086, 3.31565979589036, 1.2, 1.29451222033453, 3.29757337689173, 
1.27267027739667, 1.52382464362842, 2.69059522219046, 1.44043033268279, 
1.55447890704886, 4.51714446084742, 1.57844654741776, 1.72048032886903, 
3.97879175506843), time = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)), row.names = c(NA, -24L
), class = c("tbl_df", "tbl", "data.frame"))

Which looks like this

> head(data)
# A tibble: 6 × 4
  line  dose  expression  time
  <chr> <fct>      <dbl> <dbl>
1 low   0           1        1
2 low   5000        1.08     1
3 low   1e+05       2.75     1
4 mid   0           1.06     1
5 mid   5000        1.27     1
6 mid   1e+05       2.24     1

If this weren't grouped data, I could do the t_test like so

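library(dplyr)    # %>% and group_by()
library(ggplot2)  # ggplot(), geom_boxplot(), facet_grid()
library(rstatix)  # t_test(), add_y_position()
library(ggpubr)   # stat_pvalue_manual()

ttest <-
  data %>%
  t_test(expression ~ line) %>%
  add_y_position()
ttest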

> ttest
# A tibble: 3 × 12
  .y.        group1 group2    n1    n2 statistic    df     p p.adj p.adj.signif y.position groups      
  <chr>      <chr>  <chr>  <int> <int>     <dbl> <dbl> <dbl> <dbl> <chr>             <dbl> <named list>
1 expression high   low       12     6     0.906 12.5  0.382 0.764 ns                 4.74 <chr [2]>   
2 expression high   mid       12     6     1.31  15.9  0.209 0.627 ns                 5.06 <chr [2]>   
3 expression low    mid        6     6     0.193  8.59 0.851 0.851 ns                 5.39 <chr [2]>   

And then plot the results like this

ggplot(data, aes(x = line, y = expression)) +
  geom_boxplot() +
  stat_pvalue_manual(ttest,
                     label = ifelse("p.adj.signif" %in% names(ttest), "p.adj.signif", "p"),
                     tip.length = 0.01, hide.ns = FALSE)


However, if I add my grouping variables in, I can't get t_test to work

ttest <-
  data %>%
  group_by(dose, time) %>%
  t_test(expression ~ line) %>%
  add_y_position()

Error in `mutate()`:
ℹ In argument: `data = map(.data$data, .f, ...)`.
Caused by error in `map()`:
ℹ In index: 1.
Caused by error in `map()`:
ℹ In index: 1.
Caused by error in `t.test.default()`:
! not enough 'y' observations
Run `rlang::last_error()` to see where the error occurred.

What I want is to plot it faceted like this, but with the t_test results in each facet

ggplot(data, aes(x = line, y = expression)) +
  facet_grid(time ~ dose) +
  geom_boxplot()


Any ideas? I would also like to be able to export a table of t_test results.

Mike
  • The specific error you are getting is because most groups in your fully faceted data only have a single observation. You can't run a t-test if one of the groups only has a single observation. – Allan Cameron Feb 20 '23 at 22:56
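
The point in the comment above can be checked by counting observations per line within each dose/time cell. This is a minimal sketch, not a fix: it assumes a stand-in tibble called `design` (a hypothetical name) that reproduces only the group structure of the question's `data`, with the expression values left out. The same `count()` call should work on `data` itself.

```r
library(dplyr)

# Hypothetical stand-in with the same line/dose/time layout as the
# question's `data`; expression values are omitted because only the
# group sizes matter for this check.
design <- tibble(
  line = rep(rep(c("low", "mid", "high", "high"), each = 3), times = 2),
  dose = factor(rep(c("0", "5000", "1e+05"), times = 8),
                levels = c("0", "5000", "1e+05")),
  time = rep(c(1, 2), each = 12)
)

# Number of observations for each line within every dose/time cell
cell_counts <- design %>%
  count(dose, time, line, name = "n_obs")

# Cells in which a line has fewer than two observations
single_obs <- cell_counts %>%
  filter(n_obs < 2)

print(cell_counts)
print(single_obs)
```

With this layout, "low" and "mid" each have one observation per dose/time cell and "high" has two, so any pairwise comparison inside a cell involves a group of size one, which is what `t.test.default()` rejects with "not enough 'y' observations".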

0 Answers