I have a data.frame df1:

 set.seed(12345)
 df1 <- data.frame(group=c(rep("apple", 4), rep("pear",6)), a=rnorm(10,0,0.4), 
      b=rnorm(10,0,0.2), 
      c=rnorm(10,0,0.7), d=rnorm(10,0,0.9), e=rnorm(10,0,0.5))

How can I compute the column-wise wilcox.test p-values comparing apple (rows 1:4) to pear (rows 5:10) and add those p-values as a new row at the bottom? The result should look like df2:

 df2 <- data.frame(group=c(rep("apple", 4), rep("pear",6), "wilcox.test"), 
      a=c(rnorm(10,0,0.4), 0.393768635), b=c(rnorm(10,0,0.2), 0.286422023), 
      c=c(rnorm(10,0,0.7), 1), d=c(rnorm(10,0,0.9), 0.033006258), 
      e=c(rnorm(10,0,0.5), 1))


 > df2
          group           a           b          c           d          e
 1        apple  0.23421153 -0.02324956  0.5457353  0.73068586  0.5642554
 2        apple  0.28378641  0.36346241  1.0190496  1.97715019  -1.190179
 3        apple -0.04372133  0.07412557 -0.4510299   1.8442713 -0.5301328
 4        apple -0.18139887  0.10404329 -1.0871962  1.46920108  0.4685703
 5         pear  0.24235498  -0.1501064 -1.1183967  0.22884407  0.4272259
 6         pear -0.72718239  0.16337997  1.2635683  0.44206945  0.7303647
 7         pear  0.25203942  -0.1772715 -0.3371532 -0.29167792 -0.7065494
 8         pear -0.11047364 -0.06631552  0.4342659 -1.49584522  0.2837016
 9         pear  -0.1136639  0.22414253  0.4284864  1.59096047  0.2915938
 10        pear  -0.3677288  0.05974474 -0.1136177  0.02322094 -0.6533994
 11 wilcox.test 0.393768635 0.286422023          1 0.033006258          1
Sylvia Rodriguez

2 Answers

Here is a base R option:

Subset the data by group, use mapply to apply wilcox.test to each pair of matching columns and extract the p-value, then add the result as a new row to the existing df1.

rbind(df1, data.frame(group = 'wilcox.test', 
              mapply(function(x, y) wilcox.test(x, y)$p.value, 
                     subset(df1, group == 'apple', select = -group),
                     subset(df1, group == 'pear', select = -group)) |>
                     t() |> data.frame()))

#         group           a           b          c           d          e
#1        apple  0.23421153 -0.02324956  0.5457353  0.73068586  0.5642554
#2        apple  0.28378641  0.36346241  1.0190496  1.97715019 -1.1901790
#3        apple -0.04372133  0.07412557 -0.4510299  1.84427130 -0.5301328
#4        apple -0.18139887  0.10404329 -1.0871962  1.46920108  0.4685703
#5         pear  0.24235498 -0.15010640 -1.1183967  0.22884407  0.4272259
#6         pear -0.72718239  0.16337997  1.2635683  0.44206945  0.7303647
#7         pear  0.25203942 -0.17727150 -0.3371532 -0.29167792 -0.7065494
#8         pear -0.11047364 -0.06631552  0.4342659 -1.49584522  0.2837016
#9         pear -0.11366390  0.22414253  0.4284864  1.59096047  0.2915938
#10        pear -0.36772880  0.05974474 -0.1136177  0.02322094 -0.6533994
#11 wilcox.test  0.47619048  0.35238095  1.0000000  0.03809524  1.0000000

This uses the native pipe (|>), available from R 4.1, for readability.
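
For R versions before 4.1, where `|>` is not available, the same idea can be sketched without pipes. This is a minimal sketch, not the code from the answer itself; the names `num_cols`, `is_apple`, `p_vals` and `df2` are introduced here for illustration only.

```r
# Recreate the example data from the question.
set.seed(12345)
df1 <- data.frame(group = c(rep("apple", 4), rep("pear", 6)),
                  a = rnorm(10, 0, 0.4), b = rnorm(10, 0, 0.2),
                  c = rnorm(10, 0, 0.7), d = rnorm(10, 0, 0.9),
                  e = rnorm(10, 0, 0.5))

# Hypothetical helper names: the numeric columns and a logical group mask.
num_cols <- setdiff(names(df1), "group")
is_apple <- df1$group == "apple"

# One Wilcoxon rank-sum test per column, keeping only the p-value.
p_vals <- sapply(df1[num_cols], function(col) {
  wilcox.test(col[is_apple], col[!is_apple])$p.value
})

# Append the named p-values as a new row labelled "wilcox.test".
df2 <- rbind(df1, data.frame(group = "wilcox.test", as.list(p_vals)))
df2
```

Keeping `p_vals` as a named vector makes it easy to reuse the p-values later, for example for a multiple-testing correction.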

Ronak Shah
  • Wow, great solution! Thank you! – Sylvia Rodriguez Sep 12 '21 at 08:25
  • Is it also possible to add another row with the adjusted p values using `p.adjust` (stats package)? That would be incredible! – Sylvia Rodriguez Sep 12 '21 at 08:39
  • I am not sure how that would work here. As far as I can tell from the documentation, `p.adjust` takes a single vector of p-values, not two samples `x` and `y` like `wilcox.test`. Moreover, `p.adjust` returns output of the same length as its input, while `wilcox.test(x, y)$p.value` always returns a single number. I think it would be better to ask that as a new question with an example input and expected output. – Ronak Shah Sep 12 '21 at 08:46
  • Thank you Ronak. I will try extracting the p-values from the `wilcox.test` row into a vector, perform `p.adjust` and `rbind` the `p.adjust` vector to `df1`. – Sylvia Rodriguez Sep 12 '21 at 08:52
  • I see. If you save the output from above in `df1`, you can extract the last row and apply `p.adjust` to it. Try `rbind(df1, data.frame(group = 'p.adjust', df1[nrow(df1), -1] |> p.adjust() |> t() |> as.data.frame()))` – Ronak Shah Sep 12 '21 at 09:01
  • Thank you again, Ronak. That's awesome! – Sylvia Rodriguez Sep 12 '21 at 10:35
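
The approach discussed in these comments could be sketched as follows: compute the p-values once as a named vector, adjust them, and bind both as extra rows. This is a sketch under assumptions: the names `p_vals`, `p_adj` and `result` are hypothetical, and `p.adjust` is called with its default method (`"holm"`) because the comments do not say which method is wanted.

```r
# Recreate the example data from the question.
set.seed(12345)
df1 <- data.frame(group = c(rep("apple", 4), rep("pear", 6)),
                  a = rnorm(10, 0, 0.4), b = rnorm(10, 0, 0.2),
                  c = rnorm(10, 0, 0.7), d = rnorm(10, 0, 0.9),
                  e = rnorm(10, 0, 0.5))

# Column-wise Wilcoxon p-values, as in the answer above.
p_vals <- mapply(function(x, y) wilcox.test(x, y)$p.value,
                 subset(df1, group == "apple", select = -group),
                 subset(df1, group == "pear", select = -group))

# Adjust for multiple testing; the default method is "holm" (an assumption here).
p_adj <- p.adjust(p_vals)

# Bind the raw and the adjusted p-values as two extra rows.
result <- rbind(df1,
                data.frame(group = "wilcox.test", t(p_vals)),
                data.frame(group = "p.adjust", t(p_adj)))
result
```

Working from the vector `p_vals` rather than from the last row of the combined data frame avoids having to strip the `group` column again before calling `p.adjust`.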

We can do it with dplyr: summarise with wilcox.test, extract the p-value with $, then use bind_rows to append the p-values and adjusted p-values as the last rows.

library(dplyr)

df1 %>% summarise(across(!group, ~wilcox.test(.x ~ group)$p.value)) %>%
        bind_rows(., p.adjust(., method = 'bonferroni')) %>%
        bind_rows(df1, .) %>%
        mutate(group=replace(group, is.na(group), c('p.values', 'adjusted_p.values')))

               group           a           b          c           d          e
1              apple  0.23421153 -0.02324956  0.5457353  0.73068586  0.5642554
2              apple  0.28378641  0.36346241  1.0190496  1.97715019 -1.1901790
3              apple -0.04372133  0.07412557 -0.4510299  1.84427130 -0.5301328
4              apple -0.18139887  0.10404329 -1.0871962  1.46920108  0.4685703
5               pear  0.24235498 -0.15010640 -1.1183967  0.22884407  0.4272259
6               pear -0.72718239  0.16337997  1.2635683  0.44206945  0.7303647
7               pear  0.25203942 -0.17727150 -0.3371532 -0.29167792 -0.7065494
8               pear -0.11047364 -0.06631552  0.4342659 -1.49584522  0.2837016
9               pear -0.11366390  0.22414253  0.4284864  1.59096047  0.2915938
10              pear -0.36772880  0.05974474 -0.1136177  0.02322094 -0.6533994
11          p.values  0.47619048  0.35238095  1.0000000  0.03809524  1.0000000
12 adjusted_p.values  1.00000000  1.00000000  1.0000000  0.19047619  1.0000000
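
As a check on the `adjusted_p.values` row: the Bonferroni adjustment multiplies each p-value by the number of tests and caps the result at 1. Below is a small sketch using the p-values printed in row 11 above; the names `p`, `manual` and `builtin` are introduced here for illustration.

```r
# p-values copied from the 'p.values' row of the output above.
p <- c(a = 0.47619048, b = 0.35238095, c = 1, d = 0.03809524, e = 1)

# Bonferroni by hand: multiply by the number of tests, cap at 1.
manual <- pmin(p * length(p), 1)

# The same adjustment through stats::p.adjust.
builtin <- p.adjust(p, method = "bonferroni")

# Compare the two results, ignoring names.
all.equal(unname(manual), unname(builtin))
```

For column `d`, 0.03809524 * 5 is about 0.19, which matches the adjusted value in row 12.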
GuedesBF