Writing for loop or function to calculate p values for different dataframes

Question

I am trying to iteratively carry out a t test on value 1 (row 1, column 1) in data frame 1 against value 1 in data frame 2.

Simulated data frames 1 and 2:

DF1 <- 
  data.frame(Sample.Name = 
               c("A1_VAR_A", "A2_VAR_A", "A3_VAR_A", "A4_VAR_A", "A5_VAR_A", 
                 "A6_VAR_A", "B1_VAR_A", "B2_VAR_A", "B3_VAR_A", "B4_VAR_A", 
                 "B5_VAR_A", "B6_VAR_A"),
             Compound1 = runif(12, 0, 100),
             Compound2 = runif(12, 0, 100),
             Compound3 = runif(12, 0, 100),
             Compound4 = runif(12, 0, 100),
             Compound5 = runif(12, 0, 100),
             Compound6 = runif(12, 0, 100),
             Compound7 = runif(12, 0, 100),
             Compound8 = runif(12, 0, 100),
             Compound9 = runif(12, 0, 100),
             Compound10 = runif(12, 0, 100),
             Compound11 = runif(12, 0, 100),
             Compound12 = runif(12, 0, 100))

DF2 <- 
  data.frame(Sample.Name = 
               c("A1_VAR_B", "A2_VAR_B", "A3_VAR_B", "A4_VAR_B", "A5_VAR_B", 
                 "A6_VAR_B", "B1_VAR_B", "B2_VAR_B", "B3_VAR_B", "B4_VAR_B", 
                 "B5_VAR_B", "B6_VAR_B"),
             Compound1 = runif(12, 0, 100),
             Compound2 = runif(12, 0, 100),
             Compound3 = runif(12, 0, 100),
             Compound4 = runif(12, 0, 100),
             Compound5 = runif(12, 0, 100),
             Compound6 = runif(12, 0, 100),
             Compound7 = runif(12, 0, 100),
             Compound8 = runif(12, 0, 100),
             Compound9 = runif(12, 0, 100),
             Compound10 = runif(12, 0, 100),
             Compound11 = runif(12, 0, 100),
             Compound12 = runif(12, 0, 100))

So the comparison (using a t test) is between compound 1 of A1_VarianceA (DF1) and A1_VarianceB (DF2), compound 2 of A1_VarianceA and A1_VarianceB, and so on (horizontally), then the same for compound 1 of A2_VarianceA (DF1) and A2_VarianceB (DF2), and so on for the rest of the data set(s). Any other test for assessing the variance is also welcome, as long as I get pushed in the right direction. Is there also a visualisation I could use to display the variance in the whole data set after the t test?

Thank you in advance for the help and advice!

I've tried to reconcile the data sets and tried a for loop, but the data seemed to be messed up afterwards, with lots of NAs. Afterwards I also tried a nested for loop, but couldn't get any promising results either.

score 0 · Answer 1 · answered Mar 25 '22 at 11:06


To get the p values, you can use asplit to split both data frames into their component rows, then Map over the pairs of rows to get the p values from the 12 respective t tests.

pvals <- unlist(Map(function(a, b) t.test(a, b)$p.value, 
              a = asplit(DF1[-1], 1), 
              b = asplit(DF2[-1], 1)))

pvals
#> [1] 0.569022938 0.754087768 0.124998634 0.009124122 0.711117855
#> [6] 0.994764839 0.458249350 0.989509306 0.467061279 0.243277334
#> [11] 0.522428409 0.982656590
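The vector above is unlabelled, so it can help to bind the p values to the sample names they belong to. The following is only a sketch under some assumptions: `make_df()` is a hypothetical helper that rebuilds data with the same shape as `DF1` and `DF2`, the seed is fixed purely for reproducibility, and the bar chart of `-log10(p)` is just one simple way to look at the results, not the only option. Note that `t.test()` defaults to an unpaired Welch test.

```r
# Hypothetical helper: simulated stand-in with the same shape as DF1/DF2
set.seed(42)  # assumption: fixed seed so the sketch is reproducible
make_df <- function(suffix) {
  samples <- paste0(rep(c("A", "B"), each = 6), 1:6, "_VAR_", suffix)
  compounds <- as.data.frame(matrix(runif(12 * 12, 0, 100), nrow = 12))
  names(compounds) <- paste0("Compound", 1:12)
  data.frame(Sample.Name = samples, compounds)
}
DF1 <- make_df("A")
DF2 <- make_df("B")

# One Welch t test per row: the 12 compound values of a DF1 sample
# against the 12 compound values of the matching DF2 sample
pvals <- unlist(Map(function(a, b) t.test(a, b)$p.value,
                    a = asplit(DF1[-1], 1),
                    b = asplit(DF2[-1], 1)))

# Label each p value with the two samples it compares
results <- data.frame(Sample.DF1 = DF1$Sample.Name,
                      Sample.DF2 = DF2$Sample.Name,
                      p.value    = unname(pvals))
results

# Simple base-graphics overview: taller bars mean smaller p values
barplot(-log10(results$p.value),
        names.arg = sub("_VAR_A$", "", results$Sample.DF1),
        las = 2, ylab = "-log10(p value)")
abline(h = -log10(0.05), lty = 2)  # conventional 0.05 threshold
```

With real data, the `make_df()` part would be dropped and the existing `DF1` and `DF2` used directly, provided their rows are in matching order.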

answered Mar 25 '22 at 11:06

Allan Cameron


I have a bigger dataset and this unfortunately doesn't do the trick. I only get a numeric vector of 21 data points, whereas I have 180 values that need validation. Since I have a larger dataset, I would like to make a df with the name of each compound and its corresponding sample name. – Akki Mar 28 '22 at 10:38
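A t test cannot be run on a single value against a single value, so one p value per cell is not possible. If a result per compound is wanted, one possible reading of the comment is to test each compound column across all samples instead of each row across all compounds. The sketch below assumes that reading, and assumes both data frames share the same compound column names. `sim()` is a hypothetical helper for simulated data, and the Benjamini-Hochberg adjustment via `p.adjust()` is an optional extra for when many tests are run.

```r
# Hypothetical helper: simulated data shaped like the question's DF1/DF2
set.seed(1)  # assumption: fixed seed for reproducibility
compounds <- paste0("Compound", 1:12)
sim <- function(suffix) {
  values <- as.data.frame(matrix(runif(12 * 12, 0, 100), nrow = 12))
  names(values) <- compounds
  data.frame(Sample.Name = paste0(rep(c("A", "B"), each = 6), 1:6,
                                  "_VAR_", suffix),
             values)
}
DF1 <- sim("A")
DF2 <- sim("B")

# Compound columns present in both data frames (sample label excluded)
shared <- setdiff(intersect(names(DF1), names(DF2)), "Sample.Name")

# One Welch t test per compound: all DF1 samples vs all DF2 samples
by_compound <- data.frame(
  Compound = shared,
  p.value  = unname(vapply(shared,
                           function(cmp) t.test(DF1[[cmp]], DF2[[cmp]])$p.value,
                           numeric(1)))
)

# Optional: adjust for multiple testing (Benjamini-Hochberg)
by_compound$p.adjusted <- p.adjust(by_compound$p.value, method = "BH")
by_compound
```

This scales with the number of compound columns, so a wider data set yields one labelled row per compound rather than an unnamed vector.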
