I have a list of data frames like so:

list(structure(list(group = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L
), species = structure(c(3L, 3L, 1L, 3L, 3L, 2L, 3L, 1L, 3L, 
1L, 3L, 1L, 3L, 1L, 2L, 4L, 1L, 4L, 2L, 3L, 3L, 3L, 2L, 2L), .Label = 
c("Apiaceae", 
"Ceyperaceae", "Magnoliaceae", "Vitaceae"), class = "factor"), 
N = c(2L, 2L, 3L, 2L, 2L, 1L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 
3L, 1L, 4L, 3L, 4L, 1L, 2L, 2L, 2L, 1L, 1L)), class = "data.frame", 
row.names = c(NA, 
-24L)), structure(list(group = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L), species = structure(c(3L, 3L, 1L, 3L, 3L, 2L, 3L, 1L, 3L, 
1L, 3L, 1L, 3L, 1L, 2L, 4L, 1L, 4L, 2L, 3L, 3L, 3L, 2L, 2L), .Label = 
c("Apiaceae", 
"Ceyperaceae", "Magnoliaceae", "Vitaceae"), class = "factor"), 
N = c(2L, 2L, 3L, 2L, 2L, 1L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 
3L, 1L, 4L, 3L, 4L, 1L, 2L, 2L, 2L, 1L, 1L)), class = "data.frame", 
row.names = c(NA, 
-24L)))

I want to apply my.fun, which is written with the dplyr package, to every data frame in this list. The function should first group the data by "group" and then compute, for each group, the output of an existing R function (diversity). When I apply it to the data list, the result is empty: there is no output at all. Can you help me find the mistake?

 my.fun <- function(x, y){
    group_by(x, !!as.name(group)) %>%
    mutate(out = diversity(N, "shannon")) 
 }

check <- lapply(colnames(list), function(x) {
  my.fun(x$group, x$N)
}) 

Thanks a lot!

R starter
  • Where is the `y` parameter inside your `my.fun`. Also, if there is no `group` parameter, then you can directly use `group_by(x, group)` as `group` is a column name – akrun Apr 23 '19 at 16:18
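
As a hedged illustration of the comment above: one likely reason for the empty result is that `colnames()` returns `NULL` for a plain list, so `lapply()` has nothing to iterate over. The sketch below uses a small made-up list `lst` (a stand-in for the question's data, same column names) and assumes `diversity()` comes from the vegan package; it groups by the `group` column directly, as the comment suggests.

```r
library(dplyr)
library(vegan)

# Hypothetical stand-in for the question's list (same column names, fewer rows)
lst <- list(
  data.frame(group = c(1L, 1L, 2L, 2L), N = c(2L, 3L, 1L, 4L)),
  data.frame(group = c(1L, 1L, 2L, 2L), N = c(2L, 2L, 3L, 3L))
)

# A plain list has no dimnames, so colnames() returns NULL,
# and lapply() over NULL returns an empty list
empty <- lapply(colnames(lst), function(x) x)
length(empty)

# Group by the column name directly and iterate over the list itself
my.fun <- function(x) {
  x %>%
    group_by(group) %>%
    mutate(out = diversity(N, "shannon"))
}

check <- lapply(lst, my.fun)
```

Here `check` has one grouped tibble per data frame, each with an extra `out` column.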

1 Answer

Assuming that the grouping column and the column to which diversity is applied are both passed as strings:

library(tidyverse)
library(vegan)
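my.fun <- function(data, grpCol, divCol) {
  data %>%
    # group by the column whose name is stored in grpCol
    group_by_at(grpCol) %>%
    # convert the string in divCol to a symbol and unquote it
    mutate(out = diversity(!! rlang::sym(divCol), "shannon"))
    # or use mutate_at
    # mutate_at(vars(divCol), list(out = ~ diversity(., "shannon")))
}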

# lst1 is the list of data frames from the question
map(lst1, my.fun, grpCol = "group", divCol = "N")
#[[1]]
# A tibble: 24 x 4
# Groups:   group [3]
#   group species          N   out
#   <int> <fct>        <int> <dbl>
# 1     1 Magnoliaceae     2  1.75
# 2     1 Magnoliaceae     2  1.75
# 3     1 Apiaceae         3  1.75
# 4     1 Magnoliaceae     2  1.75
# 5     1 Magnoliaceae     2  1.75
# 6     1 Ceyperaceae      1  1.75
# 7     2 Magnoliaceae     2  2.06
# 8     2 Apiaceae         3  2.06
# 9     2 Magnoliaceae     2  2.06
#10     2 Apiaceae         3  2.06
# … with 14 more rows

#[[2]]
# A tibble: 24 x 4
# Groups:   group [3]
#   group species          N   out
#   <int> <fct>        <int> <dbl>
# 1     1 Magnoliaceae     2  1.75
# 2     1 Magnoliaceae     2  1.75
# 3     1 Apiaceae         3  1.75
# 4     1 Magnoliaceae     2  1.75
# 5     1 Magnoliaceae     2  1.75
# 6     1 Ceyperaceae      1  1.75
# 7     2 Magnoliaceae     2  2.06
# 8     2 Apiaceae         3  2.06
# 9     2 Magnoliaceae     2  2.06
#10     2 Apiaceae         3  2.06
# … with 14 more rows

Note that

identical(lst1[[1]], lst1[[2]])
#[1] TRUE
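
For readers on a recent dplyr, the same idea can also be sketched with `across(all_of())` and the `.data` pronoun instead of `group_by_at()` and `!! rlang::sym()`. This is only one possible variant, not the answer's own code: the name `my.fun2` and the small `lst1` below are made up for illustration.

```r
library(dplyr)
library(purrr)
library(vegan)

# Hypothetical stand-in list (not the question's data)
lst1 <- list(
  data.frame(group = c(1L, 1L, 2L, 2L), N = c(2L, 3L, 1L, 4L)),
  data.frame(group = c(1L, 1L, 2L, 2L), N = c(2L, 3L, 1L, 4L))
)

# my.fun2 is a hypothetical name; both column names are passed as strings
my.fun2 <- function(data, grpCol, divCol) {
  data %>%
    group_by(across(all_of(grpCol))) %>%
    mutate(out = diversity(.data[[divCol]], "shannon")) %>%
    ungroup()
}

res <- map(lst1, my.fun2, grpCol = "group", divCol = "N")
```

The `ungroup()` at the end is a design choice for this sketch so that later steps do not silently inherit the grouping; drop it if the grouped result is wanted.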
akrun
  • Thank you so much, akrun! I have a small problem defining the group variable. Can we call another function (group = my.fun0(x)) after data %>% inside this my.fun? – R starter Apr 23 '19 at 17:26
  • @SunRise You can do that only thing is the `my.fun0(x)` should return a string name for the grouping column – akrun Apr 23 '19 at 17:28
  • @SunRise Also, not sure how the `my.fun0(x)` assuming 'x' is data would automatically return the name of the grouping column. It should have some other parameter as well. In addition, not clear why you wanted to make multiple function calls – akrun Apr 23 '19 at 17:31
  • Because the group variable has to be defined for every dataset in the list. x is a column in each dataset, named B_value. – R starter Apr 23 '19 at 17:37
  • @SunRise Let's say you have 'group1', 'group2' as column name for each list element, and it is a vector of user input values, then instead of `map` use `map2` i.e. `map2(lst1, c("group1", "group2"), grpCol = .y, divCol = .N)` – akrun Apr 23 '19 at 17:40
  • Actually I obtained it like so: group <- lapply(list, my.fun0). But when I run above code you defined, it says there is no variable: group. That's why I wanted to include it into my.fun – R starter Apr 23 '19 at 17:42
  • @SunRise It should be `map2(lst1, c("group1", "group2"), ~ my.fun(.x, grpCol = .y, divCol = "N"))` Here, I changed `names(lst1[[1]])[1] <- "group1"; names(lst1[[2]])[1] <- "group2"` Using the same `my.fun` in my post – akrun Apr 23 '19 at 17:43
  • my real data list contains 20 datasets inside of it. Should I continue "group1", "group2"...until "group20"? – R starter Apr 23 '19 at 18:00
  • @SunRise Yes, you can have a vector `v1 <- paste0("group", 1:20)` and use that as `map2(lst1, v1, ~` – akrun Apr 24 '19 at 05:13
  • It is me again. How can I combine all the datasets in the list into a single dataset? – R starter Apr 24 '19 at 15:38
  • @SunRise you can use `bind_rows(yourlst)` – akrun Apr 24 '19 at 15:39
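
The `map2()` and `bind_rows()` suggestions from the comments can be put together roughly as follows. This is a minimal sketch under stated assumptions: the two small data frames are invented, their grouping columns are named `group1` and `group2` as in the comments, and `my.fun` is rewritten here with `across(all_of())` so the block runs on its own.

```r
library(dplyr)
library(purrr)
library(vegan)

# Same idea as the answer's my.fun, rewritten so this block is self-contained
my.fun <- function(data, grpCol, divCol) {
  data %>%
    group_by(across(all_of(grpCol))) %>%
    mutate(out = diversity(.data[[divCol]], "shannon")) %>%
    ungroup()
}

# Hypothetical list in which each dataset has its own grouping column name
lst1 <- list(
  data.frame(group1 = c(1L, 1L, 2L, 2L), N = c(2L, 3L, 1L, 4L)),
  data.frame(group2 = c(1L, 1L, 1L, 2L), N = c(1L, 1L, 2L, 5L))
)

# One grouping column name per list element: "group1", "group2", ...
v1 <- paste0("group", seq_along(lst1))

# .x is the data frame, .y is the matching grouping column name
res <- map2(lst1, v1, ~ my.fun(.x, grpCol = .y, divCol = "N"))

# Stack the results; .id records which list element each row came from.
# Columns that exist in only one dataset are filled with NA in the others.
combined <- bind_rows(res, .id = "dataset")
```

If every dataset used the same grouping column name, `combined` would have no NA-filled columns, which is usually easier to work with afterwards.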