I have the following data frame:
library(tidyverse)
dat <- structure(list(charge.Group3 = c(0.167, 0.167, 0.1, 0.067, 0.033,
0.033, 0.067, 0.133, 0.2, 0.067, 0.133, 0.114, 0.167, 0.033,
0.1, 0.033, 0.133, 0.267, 0.133, 0.233, 0.1, 0.167, 0.067, 0.133,
0.1, 0.133, 0.1, 0.133, 0.1, 0.067, 0.167, 0), hydrophobicity.Group3 = c(0.267,
0.467, 0.067, 0.167, 0.267, 0.1, 0.367, 0.233, 0.367, 0.233,
0.133, 0.205, 0.333, 0.267, 0.267, 0.067, 0.133, 0.3, 0.233,
0.267, 0.5, 0.333, 0.2, 0.5, 0.5, 0.4, 0.033, 0.3, 0.233, 0.5,
0.233, 0.033), class = c("Negative", "Negative", "Positive",
"Positive", "Positive", "Positive", "Positive", "Negative", "Positive",
"Positive", "Positive", "Positive", "Positive", "Positive", "Negative",
"Positive", "Negative", "Negative", "Negative", "Negative", "Negative",
"Negative", "Negative", "Negative", "Negative", "Negative", "Positive",
"Positive", "Positive", "Negative", "Positive", "Negative")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -32L))
dat
#> # A tibble: 32 x 3
#>    charge.Group3 hydrophobicity.Group3 class
#>            <dbl>                 <dbl> <chr>
#>  1         0.167                 0.267 Negative
#>  2         0.167                 0.467 Negative
#>  3         0.1                   0.067 Positive
#>  4         0.067                 0.167 Positive
#>  5         0.033                 0.267 Positive
#>  6         0.033                 0.1   Positive
#>  7         0.067                 0.367 Positive
#>  8         0.133                 0.233 Negative
#>  9         0.2                   0.367 Positive
#> 10         0.067                 0.233 Positive
#> # ... with 22 more rows
For each feature (charge.Group3 and hydrophobicity.Group3), I want to perform wilcox.test between the Negative and Positive classes, and then collect the p-values in a data frame or tibble:
features               pvalue
charge.Group3          0.1088
hydrophobicity.Group3  0.03895
# I computed these by hand.
Note that my actual data has more than two features. How can I achieve that?
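To make the goal concrete, here is a minimal sketch of the kind of result I am after. It is only one possible approach, not necessarily the best one. It assumes tidyr >= 1.0 (for pivot_longer) and dplyr >= 1.0 (for .groups), and it uses a small made-up tibble called toy (hypothetical values, same column layout as dat) so that it runs on its own. The idea is to reshape to long format so that every feature column becomes a group, then run wilcox.test once per group.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical toy data with the same layout as `dat`; the values are made up.
toy <- tibble(
  charge.Group3         = c(0.10, 0.20, 0.15, 0.30, 0.05, 0.25, 0.12, 0.22),
  hydrophobicity.Group3 = c(0.40, 0.10, 0.35, 0.20, 0.45, 0.15, 0.50, 0.05),
  class                 = rep(c("Negative", "Positive"), times = 4)
)

pvals <- toy %>%
  # Stack every column except `class` into (features, value) pairs,
  # so the code does not depend on how many feature columns exist.
  pivot_longer(cols = -class, names_to = "features", values_to = "value") %>%
  group_by(features) %>%
  # One two-sample Wilcoxon test per feature: Negative vs. Positive.
  summarise(
    pvalue = wilcox.test(value[class == "Negative"],
                         value[class == "Positive"])$p.value,
    .groups = "drop"
  )

pvals
```

Replacing toy with dat should give one row per feature column. With the real data, wilcox.test may warn that it cannot compute exact p-values because of ties.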