
I have a function that compares pairs of columns within each group using the Wilcoxon test. The function:

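group.lab <- c(1, 2)   # groups I would like to keep (not used by the function yet)
z <- c(2, 3, 4)        # indices of the first column in each pair
v <- 2                 # offset to the second column in each pair
s <- sapply(z, '+', v) # indices of the second column: 4, 5, 6
# List of column-index pairs: c(2, 4), c(3, 5), c(4, 6)
combination <- mapply(c, z, s, SIMPLIFY = FALSE)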



wilcox.fun <- function(dat) { 
  do.call(rbind, lapply(combination, function(x) {
    test <- wilcox.test(dat[[x[1]]], dat[[x[2]]], paired=FALSE)
    data.frame(Test = sprintf('Group %s by Group %s', x[1], x[2]), 
               W = round(test$statistic,4), 
               p = test$p.value)
  }))
}

result <- purrr::map_df(split(data, data$group), wilcox.fun, .id = 'Group')

I want to add a parameter so that the function runs only for the groups I specify, rather than for all of them.

What I want to get:

|   Group|
|--------|
|  1     |
|  1     |
|  1     |
|  3     |
|  3     |
|  3     |

Or another selection of groups, for example 2 and 3.

My data frame:

data <- structure(list(group = c(1L, 1L, 1L, 1L, 1L, 1L,1L,1L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), col1 = c(9, 
9.05, 7.15, 7.21, 7.34, 8.12, 7.5, 7.84, 7.8, 7.52, 8.84, 6.98, 
6.1, 6.89, 6.5, 7.5, 7.8, 5.5, 6.61, 7.65, 7.68,8.0,9.0), col2 = c(11L, 
11L, 10L, 1L, 3L, 7L, 11L, 11L, 11L, 11L, 4L, 1L, 1L, 1L, 2L, 
2L, 1L, 4L, 8L, 8L, 1L,3L,4L), col3 = c(7L, 11L, 3L, 7L, 11L, 2L, 11L, 
5L, 11L, 11L, 5L, 11L, 11L, 2L, 9L, 9L, 3L, 8L, 11L, 11L, 2L,5L,6L), 
    col4 = c(11L, 11L, 11L, 11L, 6L, 11L, 11L, 11L, 10L, 7L, 
    11L, 2L, 11L, 3L, 11L, 11L, 6L, 11L, 1L, 11L, 11L,13L,12L), col5 = c(11L, 
    1L, 2L, 2L, 11L, 11L, 1L, 10L, 2L, 11L, 1L, 3L, 11L, 11L, 
    8L, 8L, 11L, 11L, 11L, 2L, 9L,4L,5L)), .Names = c("group", "col1", 
"col2", "col3", "col4", "col5"), class = "data.frame", row.names = c(NA, 
-23L))
GOGA GOGA
  • Hey, where is the `p_value_formatted` function coming from? – elielink Aug 09 '21 at 08:27
  • Sorry, that is my own function for formatting the p-value. I removed it from the code. – GOGA GOGA Aug 09 '21 at 08:34
  • Thanks. One more question: should your parameter exclude groups or include them? (That is, would you rather specify 2 as excluded when calling the function, or pass an object containing the groups to compare?) – elielink Aug 09 '21 at 08:47
  • Rather include. For example, the parameter `group.lab=c(1,3)` means that I would like to see only groups 1 and 3 in the final table. I hope I explained that clearly :) – GOGA GOGA Aug 09 '21 at 09:00
  • OK, maybe the following comment is stupid: since you have to use `map_df` to apply your function to the `data` object, why not filter the result? Or, if you want to do it inside a function, just wrap this `map_df` step and add a filtering step to it. See the point? – elielink Aug 09 '21 at 09:10
  • Do you suggest leaving the result as it is and then removing from it the groups that are not in `group.lab=c(1,3)` (that is, deleting group 2)? – GOGA GOGA Aug 09 '21 at 09:13

1 Answer


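Does this do the trick? It filters the data down to the requested groups first, then splits by group and runs the tests as before.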

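# Relies on the global `combination` list of column-index pairs from the question.
wilcox.fun <- function(df, id_group) {
  # Keep only the rows that belong to the requested groups
  df <- df[df$group %in% id_group, ]
  # Run the test for every column pair within a single group
  test_one_group <- function(dat) {
    do.call(rbind, lapply(combination, function(x) {
      test <- wilcox.test(dat[[x[1]]], dat[[x[2]]], paired = FALSE)
      # x[1] and x[2] are column indices, despite the 'Group' label
      data.frame(Test = sprintf('Group %s by Group %s', x[1], x[2]),
                 W = round(test$statistic, 4),
                 p = test$p.value)
    }))
  }
  purrr::map_df(split(df, df$group), test_one_group, .id = 'Group')
}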


wilcox.fun(data, c(1,3))

Output:

       Group               Test    W          p
W...1      1 Group 2 by Group 4 40.0 0.42919530
W1...2     1 Group 3 by Group 5 20.0 0.14199085
W2...3     1 Group 4 by Group 6 38.5 0.51567473
W...4      3 Group 2 by Group 4 33.0 0.95802933
W1...5     3 Group 3 by Group 5  9.0 0.01679008
W2...6     3 Group 4 by Group 6 28.0 0.70822798
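
If you would rather not depend on the global `combination` object or on purrr, the same idea can be sketched in base R with the column pairs passed in as an argument. This is only an illustration: the name `wilcox_by_group`, its arguments `col_pairs` and `group_col`, and the `toy` data frame are all made up for the example and are not part of the original code.

```r
# Hypothetical helper (illustrative name and arguments, not from the question).
# df        : data frame with a grouping column
# id_group  : groups to keep
# col_pairs : list of length-2 vectors naming (or indexing) the columns to compare
wilcox_by_group <- function(df, id_group, col_pairs, group_col = "group") {
  # Keep only the requested groups
  df <- df[df[[group_col]] %in% id_group, , drop = FALSE]
  # One data frame per remaining group
  pieces <- split(df, df[[group_col]])
  rows <- lapply(names(pieces), function(g) {
    dat <- pieces[[g]]
    do.call(rbind, lapply(col_pairs, function(p) {
      test <- wilcox.test(dat[[p[1]]], dat[[p[2]]], paired = FALSE)
      data.frame(Group = g,
                 Test  = sprintf("%s vs %s", p[1], p[2]),
                 W     = unname(test$statistic),
                 p     = test$p.value)
    }))
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL  # drop the auto-generated row names
  out
}

# Made-up toy data, only to show the call; substitute your own data frame.
toy <- data.frame(
  group = rep(1:3, each = 6),
  col1  = seq_len(18) + 0.5,
  col2  = rev(seq_len(18)),
  col3  = ((seq_len(18) * 7) %% 11) + 0.25
)

res <- wilcox_by_group(toy, id_group = c(1, 3),
                       col_pairs = list(c("col1", "col2"), c("col2", "col3")))
print(res)
```

Passing column names instead of indices also makes the `Test` label say which columns were compared, which the index-based label above does not.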
elielink