Final update (using nest_by
& across
)
Inspired by @Brunos answer, I revised my approach to use nest_by
/ rowwise
instead of map
(which is, I guess, the new recommended way of wrangling nested tibbles).
The result of my original answer can be easily reproduced using nest_by
:
gapminder %>%
nest_by(country, continent) %>%
mutate(sample_res = list(transmute(data,
across(c(year, lifeExp, pop),
~ sample_function(data, var_a = .x, var_b = gdpPercap))
))
)
However, it returns one list-column containing tibble
s. If the output were normal vectors we could just remove sample_res = list()
and new columns would be added to your existing tibble. However, in this example the output of each new column is a list-column containing vectors. I have not managed to produce this output in one call to mutate(across(...))
.
It is possible though to use unnest
and then another call to summarise(across(...))
to get the job done.
gapminder %>%
nest_by(country, continent) %>%
mutate(sample_res = list(transmute(data,
across(c(year, lifeExp, pop),
~ sample_function(data, var_a = .x, var_b = gdpPercap))
))
) %>%
unnest(cols = sample_res) %>%
summarise(across(c(year, lifeExp, pop), list, .names = "res_{col}"))
Original answer (using group_by
, nest
, map
& across
)
You misspecified sample_function
in your across
call. It should be
function(x) sample_function(.x, var_a = x, var_b = gdpPercap)
instead of
~ sample_function(var_a = .x),
var_b = gdpPercap
Since you are nesting map
and mutate(across(...))
, I prefer to have at least one "normal" anonymous functions instead of the lamda ~
notation. Otherwise, things can get confusing with two .x
s.
Further across
should be called inside its own separate mutate
.
This should work:
library("tidyverse")
data("gapminder", package = "gapminder")
# Sample function
sample_function <- function(.data, var_a, var_b) {
var_a <- enquo(var_a)
var_b <- enquo(var_b)
.data %>%
mutate(some_res = log(!!var_a) + !!var_b) %>%
pull(some_res)
}
gapminder %>%
group_by(country, continent) %>%
nest() %>%
mutate(sample_res = map(
data,
~ mutate(.x, across(c(year, lifeExp, pop),
function(x) {
sample_function(.x, var_a = x, var_b = gdpPercap)
}
)
)
)
)
#> # A tibble: 142 x 4
#> # Groups: country, continent [142]
#> country continent data sample_res
#> <fct> <fct> <list> <list>
#> 1 Afghanistan Asia <tibble [12 × 4]> <tibble [12 × 4]>
#> 2 Albania Europe <tibble [12 × 4]> <tibble [12 × 4]>
#> 3 Algeria Africa <tibble [12 × 4]> <tibble [12 × 4]>
#> 4 Angola Africa <tibble [12 × 4]> <tibble [12 × 4]>
#> 5 Argentina Americas <tibble [12 × 4]> <tibble [12 × 4]>
#> 6 Australia Oceania <tibble [12 × 4]> <tibble [12 × 4]>
#> 7 Austria Europe <tibble [12 × 4]> <tibble [12 × 4]>
#> 8 Bahrain Asia <tibble [12 × 4]> <tibble [12 × 4]>
#> 9 Bangladesh Asia <tibble [12 × 4]> <tibble [12 × 4]>
#> 10 Belgium Europe <tibble [12 × 4]> <tibble [12 × 4]>
#> # … with 132 more rows
Created on 2020-06-03 by the reprex package (v0.3.0)
When using map
with custom functions to loop over tibbles
in a list-column it is quite helpful to build a first version outside of the loop.
test_dat <- gapminder %>%
nest_by(country, continent)
test_dat$data[[1]] %>%
mutate(across(
c(year, lifeExp, pop),
~ sample_function(test_dat$data[[1]], var_a = .x, var_b = gdpPercap)
)
)
Once this works, then the final step is to replace the object you want to loop over with .x
.
Another approach (part of original answer)
Another approach would be to rewrite your original sample_function
and include across
in your mutate
call there. We could make it take a string vector of variable names that will be passed to across
. I might like this approach more, because its more flexible. Now you could have another list-column containing different variable names for different subsets of your data and loop over them and your data column with map2
.
library("tidyverse")
data("gapminder", package = "gapminder")
sample_function2 <- function(.data, .vars, var_b) {
.vars <- syms(.vars)
var_b <- enquo(var_b)
.data %>%
mutate(across(c(!!!.vars), function(y) log(y) + !!var_b))
}
gapminder %>%
group_by(country, continent) %>%
nest() %>%
mutate(sample_res = map(
data,
~ sample_function2(.x,
.vars = c("year", "lifeExp", "pop"),
var_b = gdpPercap)
)
)
#> # A tibble: 142 x 4
#> # Groups: country, continent [142]
#> country continent data sample_res
#> <fct> <fct> <list> <list>
#> 1 Afghanistan Asia <tibble [12 × 4]> <tibble [12 × 4]>
#> 2 Albania Europe <tibble [12 × 4]> <tibble [12 × 4]>
#> 3 Algeria Africa <tibble [12 × 4]> <tibble [12 × 4]>
#> 4 Angola Africa <tibble [12 × 4]> <tibble [12 × 4]>
#> 5 Argentina Americas <tibble [12 × 4]> <tibble [12 × 4]>
#> 6 Australia Oceania <tibble [12 × 4]> <tibble [12 × 4]>
#> 7 Austria Europe <tibble [12 × 4]> <tibble [12 × 4]>
#> 8 Bahrain Asia <tibble [12 × 4]> <tibble [12 × 4]>
#> 9 Bangladesh Asia <tibble [12 × 4]> <tibble [12 × 4]>
#> 10 Belgium Europe <tibble [12 × 4]> <tibble [12 × 4]>
#> # … with 132 more rows
Created on 2020-06-04 by the reprex package (v0.3.0)
Add on (to original answer)
As @Bruno points out that the approaches above are not in the format specified by the OP, here is an alternative solution building on my second approach above, which should yield the desired output.
library("tidyverse")
data("gapminder", package = "gapminder")
sample_function2 <- function(.data, .vars, var_b) {
.vars <- syms(.vars)
var_b <- enquo(var_b)
.data %>%
transmute(across(c(!!!.vars), function(y) log(y) + !!var_b)) %>%
unlist()
}
my_vars <- c("year", "lifeExp", "pop")
gapminder %>%
group_by(country, continent) %>%
nest() %>%
crossing(vars = my_vars) %>%
mutate(sample_res = map2(
data,
vars,
~ sample_function2(.x,
.vars = .y,
var_b = gdpPercap)
)
) %>%
pivot_wider(names_from = vars,
names_prefix = "res_",
values_from = sample_res)
#> # A tibble: 142 x 6
#> country continent data res_lifeExp res_pop res_year
#> <fct> <fct> <list> <list> <list> <list>
#> 1 Afghanistan Asia <tibble [12 × 4]> <dbl [12]> <dbl [12]> <dbl [12]>
#> 2 Albania Europe <tibble [12 × 4]> <dbl [12]> <dbl [12]> <dbl [12]>
#> 3 Algeria Africa <tibble [12 × 4]> <dbl [12]> <dbl [12]> <dbl [12]>
#> 4 Angola Africa <tibble [12 × 4]> <dbl [12]> <dbl [12]> <dbl [12]>
#> 5 Argentina Americas <tibble [12 × 4]> <dbl [12]> <dbl [12]> <dbl [12]>
#> 6 Australia Oceania <tibble [12 × 4]> <dbl [12]> <dbl [12]> <dbl [12]>
#> 7 Austria Europe <tibble [12 × 4]> <dbl [12]> <dbl [12]> <dbl [12]>
#> 8 Bahrain Asia <tibble [12 × 4]> <dbl [12]> <dbl [12]> <dbl [12]>
#> 9 Bangladesh Asia <tibble [12 × 4]> <dbl [12]> <dbl [12]> <dbl [12]>
#> 10 Belgium Europe <tibble [12 × 4]> <dbl [12]> <dbl [12]> <dbl [12]>
#> # … with 132 more rows
Created on 2020-06-04 by the reprex package (v0.3.0)