One-sample T-test Over Multiple Columns with Multiple mu Values in R

Question

I have several datasets, each for a particular time point, and each containing several measures. For each of them, I want to conduct a one-sample t-test on each measure, so across all the columns. Each measure has a different mu value that I want to compare my results with. I have tried creating a function to do this so I only have to give it the name of the dataset as an argument. I have created a list of mu values. However, the function won't accept this and I get an error. Here is an example dataset:

t1 <- rnorm(20, 10, 1)
t2 <- rnorm(20, 10, 1)
t3 <- rnorm(20, 10, 1)
test_data <- data.frame(t1, t2, t3)

And the lists of mu values and variables:

muvals <- c(24, 51.8, 21.89)
varlist <- c(t1, t2, t3)

This is my attempt at the function:

onett <- function(tpoint) {
  t.test(tpoint$varlist, mu = muvals)
}

And the error message I get is: Error in t.test.default(tpoint$varlist, mu = muvals) : 'mu' must be a single number

Is there a way to get this function to work, or otherwise iterate through each column and the list of mu values?

Edit: Each mu value only applies to one column. So the first value for the first column, etc.

There's a good few ways of looping in R, some tidier than others! Can prep an answer but it would be helpful to know what sort of output you're looking for. Do you want it to print all (nine) t-test results? Or store outputs somewhere? — Andy Baxter, Dec 02 '21 at 16:43
Yes, it would be helpful if I can store the values of the tests — muriosity, Dec 03 '21 at 14:13

Andy Baxter · Accepted Answer · 2021-12-04T20:49:21.967

To iterate over every combination of column and mu value and simply print the results of all the t-tests, purrr::cross2 gives you a list of all column/mu combinations, and purrr::map then runs a test on each one:

library(purrr)

t1 <- rnorm(20, 10, 1)
t2 <- rnorm(20, 10, 1)
t3 <- rnorm(20, 10, 1)
test_data <- data.frame(t1, t2, t3)

onett <- function(data) {
  muvals <- c(24, 51.8, 21.89)
  map(cross2(data, muvals), ~ t.test(.x[[1]], mu = .x[[2]]))
}

onett(test_data)
#> Prints t-test results...
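Note that cross2 was deprecated in purrr 1.0.0, so the code above may warn on newer installations. As a minimal sketch of the same all-combinations idea in base R (the names `combos` and `tests`, and the seed, are assumptions made for illustration, not part of the original answer):

```r
set.seed(42)  # assumed seed, only so the sketch is reproducible
test_data <- data.frame(
  t1 = rnorm(20, 10, 1),
  t2 = rnorm(20, 10, 1),
  t3 = rnorm(20, 10, 1)
)
muvals <- c(24, 51.8, 21.89)

# One row per column/mu combination (3 columns x 3 mu values = 9 rows)
combos <- expand.grid(
  measure = names(test_data),
  mu = muvals,
  stringsAsFactors = FALSE
)

# Run one t-test per row and keep the full htest objects in a list
tests <- Map(
  function(measure, mu) t.test(test_data[[measure]], mu = mu),
  combos$measure,
  combos$mu
)

length(tests)
```

Keeping the full test objects in a list means any statistic can be pulled out afterwards, for example with `sapply(tests, function(x) x$p.value)`.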

Edit #1

From your clarification of the question, it looks like map2 is what you need: it iterates over two objects of the same length simultaneously. To wrap this in a function that you pass the data to, I'd suggest something like the following:

library(purrr)
library(dplyr)
library(tidyr)

t1 <- rnorm(20, 10, 1)
t2 <- rnorm(20, 10, 1)
t3 <- rnorm(20, 10, 1)
test_data <- data.frame(t1, t2, t3)


# (It usually works best to define `muvals` as a function argument rather than in the global environment)

onett <- function(data, muvals = c(24, 51.8, 21.89)) {
  map2(data, muvals, function(data, mu) t.test(data, mu = mu))
}

onett(test_data) %>% 
  map_dfr(broom::tidy)

#> # A tibble: 3 x 8
#>   estimate statistic  p.value parameter conf.low conf.high method    alternative
#>      <dbl>     <dbl>    <dbl>     <dbl>    <dbl>     <dbl> <chr>     <chr>      
#> 1    10.1      -50.4 1.07e-21        19     9.50      10.7 One Samp~ two.sided  
#> 2    10.3     -187.  1.65e-32        19     9.83      10.8 One Samp~ two.sided  
#> 3     9.99     -47.8 2.87e-21        19     9.47      10.5 One Samp~ two.sided

The function returns a list of t-test results. You can use broom::tidy to extract the t statistics, p-values etc. (shown above), or tidy the output within the function so that it returns exactly what you need.
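As a rough sketch of tidying inside the function, one option is to return one row per column, labelled with the column name and the mu it was tested against. The name `onett_tidy`, the `measure` and `mu` columns, and the seed are assumptions made for this illustration:

```r
library(purrr)
library(broom)

set.seed(42)  # assumed seed, only so the sketch is reproducible
test_data <- data.frame(
  t1 = rnorm(20, 10, 1),
  t2 = rnorm(20, 10, 1),
  t3 = rnorm(20, 10, 1)
)

# Hypothetical variant of onett() that returns a tidy table directly
onett_tidy <- function(data, muvals = c(24, 51.8, 21.89)) {
  stopifnot(length(muvals) == ncol(data))  # exactly one mu per column
  tests <- map2(data, muvals, function(x, mu) t.test(x, mu = mu))
  # map2() keeps the column names, so .id can record them per row
  results <- map_dfr(tests, tidy, .id = "measure")
  results$mu <- muvals  # record which mu each column was tested against
  results
}

results <- onett_tidy(test_data)
results
```

Because the result is an ordinary data frame, the output for each time point can be stored in its own object or bound together with the others.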

Created on 2021-12-04 by the reprex package (v2.0.1)

Hi there. Sorry I wasn't clear. Each mu value would only apply to one column. So the first value in the list is for the first column, the second value for the second column, etc. Would this work without using cross2? Or would I need to use map2? — muriosity, Dec 03 '21 at 14:11
Hi @miku - ah sorry that's what I had thought you were looking for the first time, and presumed I had got it wrong. You can use `map2` quite easily for this - will update my solution. — Andy Baxter, Dec 04 '21 at 20:34
That's amazing @Andrew Baxter. Thank you so much. Could you possibly explain why it is better to define muvals in the function rather than the environment? — muriosity, Dec 06 '21 at 09:33
Glad it helped :). Basically, if defined in function arguments (as above), then you can make sure it runs the same way every time. If it looks for `muvals` in the environment then a) you need to make sure you run that line first and b) if `muvals` gets changed somehow then the function runs differently next time (as it looks up `muvals` again - see https://adv-r.hadley.nz/functions.html#lexical-scoping for helpful rundown). You could put the definition *in* the function, but putting it in the list of arguments to pass to the function allows you to pass new values to same function if needed. — Andy Baxter, Dec 07 '21 at 10:35
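
To illustrate the scoping point in the comment above, here is a minimal base R sketch. The names `mu_global`, `p_from_global` and `p_from_arg` are hypothetical and exist only for this example:

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible
x <- rnorm(20, 10, 1)

mu_global <- 10  # hypothetical value defined in the global environment

# Relies on mu_global, which is looked up again every time the function runs
p_from_global <- function(x) t.test(x, mu = mu_global)$p.value

# Takes mu as an argument with a default, so the value in use is explicit
p_from_arg <- function(x, mu = 10) t.test(x, mu = mu)$p.value

before <- p_from_global(x)
mu_global <- 24            # the global value is changed later in the session
after <- p_from_global(x)  # same call as before, but a different result

same_call_same_result <- identical(before, after)
arg_version_stable <- identical(p_from_arg(x), before)
arg_version_override <- identical(p_from_arg(x, mu = 24), after)
```

Here `same_call_same_result` is FALSE because the function picked up the changed global, while the argument version gives the same answer until a different `mu` is passed deliberately.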

Yacine Hajji · Answer 2 · 2021-12-02T17:01:03.593

There may be shorter ways, but here is a suggestion for testing all your samples against all your mu values and storing the p-values in a data frame.

Below is a function that takes both a sample and a mu value, followed by a data frame that stores the resulting p-values.

# 1- Simulating samples into a data-frame
set.seed(1)
for(k in 1:3){ 
  assign(paste("t", k, sep=""), rnorm(20, 10, 1)) 
}
test_data <- data.frame(t1, t2, t3)

# 2- Choosing mu values to test
muvals <- c(24, 51.8, 21.89)

# 3- Creating function which depends on both your sample and your mu value
onett <- function(tPoint, muValue) {
  t.test(tPoint, mu=muValue)$p.value
}

# 4- Creating a data-frame for p-values storage with your mu values as row-names and your sample name as column-name
dfPvalues <- data.frame(matrix(NA, length(muvals), ncol(test_data)), row.names=muvals)
colnames(dfPvalues) <- colnames(test_data)

# 5- Filling the p-value data-frame through a loop
for(i in 1:nrow(dfPvalues)){
  for(j in 1:ncol(dfPvalues)){
    dfPvalues[i, j] <- onett(tPoint=test_data[,j], muValue=muvals[i])
  }
}
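
The nested loop above can also be sketched with sapply, and the one-mu-per-column case from the question's edit with mapply. This is an illustrative alternative under the same simulated data, not part of the original answer; `pmat` and `paired` are names assumed for the example:

```r
set.seed(1)
test_data <- data.frame(
  t1 = rnorm(20, 10, 1),
  t2 = rnorm(20, 10, 1),
  t3 = rnorm(20, 10, 1)
)
muvals <- c(24, 51.8, 21.89)

onett <- function(tPoint, muValue) {
  t.test(tPoint, mu = muValue)$p.value
}

# All combinations: rows are mu values, columns are samples (as in dfPvalues)
pmat <- sapply(test_data, function(col) {
  sapply(muvals, function(m) onett(col, m))
})
rownames(pmat) <- muvals

# One mu per column: first mu with t1, second with t2, third with t3
paired <- mapply(onett, test_data, muvals)
paired
```

The paired p-values are the diagonal of the full matrix, so if only the one-to-one comparison is needed, the mapply call alone is enough.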
