How to iteratively compute p-values for t-test

Question

a) Generate 50 values from X ~ N (μX= 25, σX = 4) and 50 values from Y ~ N (μY= 25, σY = 4). Use a t-test to test for equality of the means.

c) Repeat part (a) 2500 times, and retain the p-value for each of the 2500 tests. Each repetition should generate a new sample for x and a new sample for y. DO NOT PRINT the p-values. DO NOT use a loop.

I solved Part A for a single pair of rnorm samples, but I'm confused about where to start to get 2500 different random samples of x and 2500 different random samples of y, and from them 2500 different p-values.

I also don't know how to write my code so that my professor will get the same answers I did. I tried setting the seed, but that only makes the p-values all the same with my code below.

# Part A

set.seed(1081)
x = rnorm(50,25,4)
y = rnorm(50,25,4)

t.test(x,y)

#Part B
#The p-value is 0.3752.
#We do not reject the null hypothesis.

#Part C

x1 = sample(x, 2500, replace = T)
y1 = sample(y, 2500, replace = T)
pval = sample(t.test(x1,y1)$p.value, 2500, replace = T)

ilias23 · Answer 1 · 2019-04-18T21:50:48.757

1

Another approach is this:

    library(MASS)        # load MASS for mvrnorm()

    s <- 16 * diag(2500) # covariance matrix: variance = sd^2 = 4^2 = 16 on the diagonal
    set.seed(123)        # seed to replicate results

    x <- mvrnorm(50, mu = rep(25, times = 2500), Sigma = s)  # 50 rows x 2500 columns: one sample of 50 per column

    y <- mvrnorm(50, mu = rep(25, times = 2500), Sigma = s)  # same layout for y

    diff <- x - y

    test <- apply(diff, 2, t.test)  # one-sample t-test on each column of differences

    names(test[[1]])  # components of each result that you can print, e.g. "p.value"
    pvals <- sapply(test, function(res) res$p.value)  # the 2500 p-values

If you have questions about the code you can ask me.

edited Apr 18 '19 at 21:50

answered Apr 18 '19 at 21:24

ilias23


Why do we take the difference of x and y? – Ross Fosher Apr 18 '19 at 22:06
When you want to perform a t-test between two samples (here x and y), you can instead test whether their difference is zero. So to do a single t-test, you can create a new sample z (z = x - y) and then run t.test(z, mu = 0). You will get nearly the same results as with the two-sample test t.test(x, y). – ilias23 Apr 19 '19 at 21:37
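As a rough check of the comment above, the sketch below runs the three tests on one pair of samples. The seed and the variable names (`p_diff`, `p_paired`, `p_welch`) are assumptions made for illustration. The test on the differences is exactly a paired t-test, while the default two-sample call is Welch's test, so its p-value will generally be close but not identical.

```r
# Illustrative only: seed and names are not from the original answer.
set.seed(123)
x <- rnorm(50, mean = 25, sd = 4)
y <- rnorm(50, mean = 25, sd = 4)

p_diff   <- t.test(x - y, mu = 0)$p.value       # one-sample test on the differences
p_paired <- t.test(x, y, paired = TRUE)$p.value # paired test, same statistic as above
p_welch  <- t.test(x, y)$p.value                # default two-sample (Welch) test

all.equal(p_diff, p_paired)  # the first two agree
c(differences = p_diff, welch = p_welch)
```

Because x and y are drawn independently here, pairing is arbitrary; the assignment asks for a test of equality of means, so the two-sample call is probably the safer reading.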

score 1 · Answer 2 · answered Apr 19 '19 at 08:59

Another possibility is to use replicate:

Note that you have to set the random seed outside of the function.

myfun <- function(){
  x <- rnorm(50, 25, 4)
  y <- rnorm(50, 25, 4)

  return(t.test(x, y)$p.value)
}


set.seed(1)
p_vals <- replicate(2500, myfun())
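To see why the seed has to be set outside the function, here is a small sketch. `seeded_fun` and `unseeded_fun` are hypothetical names used only for this illustration. Resetting the seed inside the function makes every repetition reuse the same draws, whereas setting it once before `replicate` gives different p-values that are still reproducible.

```r
# Hypothetical variant: the seed is reset on every call (not what you want).
seeded_fun <- function() {
  set.seed(1)
  x <- rnorm(50, 25, 4)
  y <- rnorm(50, 25, 4)
  t.test(x, y)$p.value
}
same_p <- replicate(5, seeded_fun())
length(unique(same_p))  # 1: all five p-values are identical

# Seed set once, outside the function: values differ, runs are reproducible.
unseeded_fun <- function() {
  x <- rnorm(50, 25, 4)
  y <- rnorm(50, 25, 4)
  t.test(x, y)$p.value
}
set.seed(1)
run1 <- replicate(5, unseeded_fun())
set.seed(1)
run2 <- replicate(5, unseeded_fun())
identical(run1, run2)  # TRUE: same seed, same sequence of p-values
```

Anyone who runs the script with the same seed (and the same default random number generator settings) should get the same 2500 p-values.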

score 0 · Answer 3 · answered Apr 19 '19 at 19:44

Yet another possibility is:

set.seed(1081)
n <- 50
times <- 2500
x <- data.frame(matrix(rnorm(n*times, mean=25, sd=4), nrow=n))
y <- data.frame(matrix(rnorm(n*times, mean=25, sd=4), nrow=n))
pvals <- mapply(FUN = function(x,y) t.test(x,y)$p.value, x, y)
mean(pvals < .05)  # should be ~= .05

See also: Loop simultaneously over two lists in R (jogo's comment)

But if we take "each repetition should generate new samples" literally, @Cettt's answer may be what is wanted.
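If "no loop" is meant to rule out the apply family as well, one option is to compute Welch's statistic directly with column-wise operations. This is only a sketch under that assumption: the helper `col_var` and the other names are made up here, and the formula is the standard Welch t-test that `t.test(x, y)` uses by default.

```r
set.seed(1081)
n <- 50
times <- 2500
x <- matrix(rnorm(n * times, mean = 25, sd = 4), nrow = n)  # one sample per column
y <- matrix(rnorm(n * times, mean = 25, sd = 4), nrow = n)

# Hypothetical helper: sample variance of each column.
col_var <- function(m) colSums(sweep(m, 2, colMeans(m))^2) / (nrow(m) - 1)

vx <- col_var(x) / n  # squared standard error of each column mean of x
vy <- col_var(y) / n  # same for y

t_stat <- (colMeans(x) - colMeans(y)) / sqrt(vx + vy)
df <- (vx + vy)^2 / (vx^2 / (n - 1) + vy^2 / (n - 1))  # Welch-Satterthwaite df
pvals <- 2 * pt(-abs(t_stat), df)                      # two-sided p-values

# Spot check against t.test() for the first pair of columns.
all.equal(pvals[1], t.test(x[, 1], y[, 1])$p.value)
```

The spot check on the first column is there so the hand-written formula can be compared against `t.test` before trusting all 2500 values.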
