
I am new to R and would appreciate some help. My problem is as follows:

Let's say I have 5 groups (Group1, Group2, Group3, Group4 and Group5), each containing 100 data points.

I want to compare every group with every other group, using either a t-test or a KS test, and collect the p-values in a matrix. For 5 groups that would be a 5x5 matrix of p-values. I have done something similar for correlations using the corr.mat function. The 5 groups are only for illustration: in the end I have to do this for almost 250 groups, so I need to generate a 250x250 matrix of p-values.

If anyone could help me achieve this, I would be very grateful.

Things I know in R so far:

Load the data into R from a .csv file:

my.data = read.csv(file.choose())
attach(my.data)
Jilber Urbina

1 Answer

If you know how to compute an individual p-value, you can just put that code in a loop.

# Sample data
d <- data.frame(
  group = paste( "group", rep(1:5, each=100) ),
  value = rnorm( 5*100 )
)

# Matrix to store the result, with rows and columns named after the groups
groups <- unique( d$group )
result <- matrix(NA_real_, nrow=length(groups), ncol=length(groups))
colnames(result) <- rownames(result) <- groups

# Loop
for( g1 in groups ) {
  for( g2 in groups ) {
    result[ g1, g2 ] <- t.test( 
      d$value[ d$group == g1 ], 
      d$value[ d$group == g2 ]
    )$p.value              
  }
}
result

#           group 1   group 2   group 3   group 4   group 5
# group 1 1.0000000 0.6533393 0.7531349 0.6239723 0.6194475
# group 2 0.6533393 1.0000000 0.9047020 0.9985489 0.3316215
# group 3 0.7531349 0.9047020 1.0000000 0.8957871 0.4190027
# group 4 0.6239723 0.9985489 0.8957871 1.0000000 0.2833226
# group 5 0.6194475 0.3316215 0.4190027 0.2833226 1.0000000
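
With about 250 groups, the double loop above tests every pair twice and also tests each group against itself. The following is a minimal sketch, assuming the same long-format layout as the sample data, that tests each unordered pair once and mirrors the p-value across the diagonal. The names `values` and `pairs` are introduced here for illustration only, and `set.seed(1)` is just there to make the random sample reproducible.

```r
# Sample data in the same long format as above (assumed layout)
set.seed(1)
d <- data.frame(
  group = paste("group", rep(1:5, each = 100)),
  value = rnorm(5 * 100)
)

groups <- as.character(unique(d$group))
values <- split(d$value, d$group)   # one numeric vector per group, named by group

# Start from an identity matrix: a group compared with itself gives p = 1,
# and every off-diagonal cell is overwritten in the loop below
result <- diag(1, length(groups))
dimnames(result) <- list(groups, groups)

# Every unordered pair of groups, one pair per column
pairs <- combn(groups, 2)
for (k in seq_len(ncol(pairs))) {
  g1 <- pairs[1, k]
  g2 <- pairs[2, k]
  p <- t.test(values[[g1]], values[[g2]])$p.value
  result[g1, g2] <- p   # upper triangle
  result[g2, g1] <- p   # mirrored into the lower triangle
}
result
```

For n groups this runs n(n-1)/2 tests instead of n^2, which for 250 groups is 31,125 rather than 62,500.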

You could also use outer:

groups <- unique( d$group )
outer( 
  groups, groups, 
  Vectorize( function(g1,g2) {
    t.test( 
      d$value[ d$group == g1 ], 
      d$value[ d$group == g2 ]
    )$p.value
  } )
)
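
The same `outer` idea can be wrapped in a small function so that the test can be swapped, for example for `ks.test`. This is only a sketch under assumptions: `pvalue_matrix` is a hypothetical helper, not a built-in function, and `wide` is made-up data with one column per group, which is what a CSV with one column per group might look like after `read.csv`.

```r
# Made-up wide data: one column per group (assumed layout)
set.seed(1)
wide <- as.data.frame(matrix(rnorm(5 * 100), ncol = 5))
names(wide) <- paste0("Group", 1:5)

# Hypothetical helper: matrix of p-values for any two-sample test
# whose result has a $p.value element (t.test, ks.test, wilcox.test, ...)
pvalue_matrix <- function(data, test = t.test) {
  cols <- names(data)
  p <- outer(
    cols, cols,
    Vectorize(function(a, b) test(data[[a]], data[[b]])$p.value)
  )
  dimnames(p) <- list(cols, cols)   # label rows and columns with the group names
  p
}

t_p  <- pvalue_matrix(wide, t.test)
ks_p <- pvalue_matrix(wide, ks.test)
t_p
```

Because the function only uses `names(data)` and `data[[name]]`, a named list of numeric vectors should work as well as a data frame, which may help when each group is already a separate vector of its own. Comparing a group with itself in `ks.test` may produce a warning about ties; the p-value for those diagonal cells is not informative anyway.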
Vincent Zoonekynd
  • Thank you very much for your effort. R is entirely new to me, so I am still having trouble with the implementation. Here is where I am: I have loaded the data into R, and the individual groups (in my case called wave0, wave1, wave2, wave3, wave4 and wave5) are available. When I type wave0, it shows me all 100 data points, and likewise for wave1 to wave5. Could you please provide the code that creates the matrix to store the p-values, and the for loop that reads wave0 to wave5 directly and performs the t-test? Again, many thanks! – user3077726 Dec 07 '13 at 17:33