I would like to use the map function with the tidyverse to create a column of data frames based on arguments from some, but not all, of the columns of the original data frame/tibble.

I would prefer to be able to use the map function so that I can replace this with future_map to utilize parallel computing.

The following solution produces the correct end result, except that it does not use map (see also this question and answer: How to use rowwise to create a list column based on a function):

library(tidyverse)
library(purrr)

df <- data.frame(a= c(1,2,3), b=c(2,3,4), c=c(6,5,8))

fun <- function(q,y) {
    r <- data.frame(col1 = c(q+y, q, q, y), col2 = c(q,q,q,y))
    r
}

result1 <- df %>% rowwise(a) %>% mutate(list1 = list(fun(a, b)))

> result1
# A tibble: 3 × 4
# Rowwise:  a
      a     b     c list1       
  <dbl> <dbl> <dbl> <list>      
1     1     2     6 <df [4 × 2]>
2     2     3     5 <df [4 × 2]>
3     3     4     8 <df [4 × 2]>

How can I instead do this with map? Here are three incorrect attempts:

Incorrect attempt 1:

wrong1 <- df %>% mutate(list1 = map(list(a,b), fun))

Incorrect attempt 2:

wrong2 <- df %>% mutate(list1 = map(c(a,b), fun))

Incorrect attempt 3:

wrong3 <- df %>% mutate(list1 = list(map(list(a,b), fun)))

The error I get is: x argument "y" is missing, with no default. I am not sure how to pass multiple arguments in a situation like this.

I would like a solution with multiple arguments, but if that is not possible, let's move to a function with one argument.

fun_one_arg <- function(q) {
    r <- data.frame(col1 = c(q, q, q, q+q), col2 = c(3*q,q,q,q/2))
    r
}

wrong4 <- df %>% mutate(list1 = map(a, fun_one_arg))
wrong5 <- df %>% mutate(list1 = list(map(a, fun_one_arg)))

These run, but the fourth columns are not data frames, as I would have expected.

bill999

1 Answer

We can use map2, since fun takes two arguments: it iterates over a and b in parallel and calls fun once per row.

library(dplyr)
library(purrr)  # provides map2() and pmap()
df %>%
    mutate(list1 = map2(a, b, fun)) %>%
    as_tibble
# A tibble: 3 x 4
      a     b     c list1       
  <dbl> <dbl> <dbl> <list>      
1     1     2     6 <df [4 × 2]>
2     2     3     5 <df [4 × 2]>
3     3     4     8 <df [4 × 2]>
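Since the question mentions switching to future_map later, here is a minimal sketch of the parallel counterpart of map2. It assumes the furrr package (and its dependency future) is installed; the choice of two multisession workers is an arbitrary assumption for illustration, and result_parallel is just a placeholder name.

```r
library(dplyr)
library(furrr)  # assumption: furrr and future are installed

fun <- function(q, y) {
  data.frame(col1 = c(q + y, q, q, y), col2 = c(q, q, q, y))
}

df <- data.frame(a = c(1, 2, 3), b = c(2, 3, 4), c = c(6, 5, 8))

# Assumption: two local background R sessions; adjust `workers` as needed.
future::plan(future::multisession, workers = 2)

# future_map2() takes the same .x, .y, .f arguments as map2().
result_parallel <- df %>%
  as_tibble() %>%
  mutate(list1 = future_map2(a, b, fun))

# Switch back to sequential evaluation when finished.
future::plan(future::sequential)
```

For a three-row data frame the overhead of starting workers will likely outweigh any gain; the swap is only worthwhile when each call to fun is expensive.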

Another option is pmap, which can also take more than two columns. Inside the lambda, ..1 and ..2 refer to the selected columns in order.

df %>%
    mutate(list1 = pmap(across(c(a, b)), ~ fun(..1, ..2))) %>%
    as_tibble
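pmap matches the names of its input list against the function's argument names, which is why passing columns named a and b straight to fun (whose arguments are q and y) fails with "unused arguments". As an alternative to the ..1, ..2 lambda, a sketch that renames the inputs to match fun's signature could look like this (result_pmap is a placeholder name):

```r
library(dplyr)
library(purrr)

fun <- function(q, y) {
  data.frame(col1 = c(q + y, q, q, y), col2 = c(q, q, q, y))
}

df <- data.frame(a = c(1, 2, 3), b = c(2, 3, 4), c = c(6, 5, 8))

# The list names q and y line up with fun's arguments,
# so pmap calls fun(q = a[i], y = b[i]) for each row i.
result_pmap <- df %>%
  as_tibble() %>%
  mutate(list1 = pmap(list(q = a, y = b), fun))
```

This form extends to functions with three or more arguments by adding further named elements to the list.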
akrun
  • Surely you have answered a duplicate question in the past. – IRTFM Sep 16 '21 at 19:51
  • Thanks! A couple of questions, if you don't mind: For the first solution, why does `list1` show up as (for the first row) `3, 1, 1, 2, 1, 1, 1, 2` instead of `<df [4 × 2]>` as in the `rowwise` solution? And for the second solution, I am getting the error: `Error: Problem with mutate() column list1. ℹ list1 = pmap(across(c(a, b)), fun). x unused arguments (a = .l[[1]][[i]], b = .l[[2]][[i]])` – bill999 Sep 16 '21 at 19:56
  • @bill999 The `df` input is a data.frame, so `mutate` returns a data.frame. `rowwise` adds the attributes and creates a `tibble`. – akrun Sep 16 '21 at 20:01
  • I updated the post. Now both of them give the same output. – akrun Sep 16 '21 at 20:01
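To illustrate the point made in the comments, the sketch below checks the contents of the list column directly instead of relying on how it is printed. The names res_df and res_tbl are placeholders; the check suggests that the elements are data frames in both cases and that only the container class (data.frame versus tibble) differs.

```r
library(dplyr)
library(purrr)

fun <- function(q, y) {
  data.frame(col1 = c(q + y, q, q, y), col2 = c(q, q, q, y))
}

df <- data.frame(a = c(1, 2, 3), b = c(2, 3, 4), c = c(6, 5, 8))

# mutate() keeps the class of its input, so this is still a base data.frame.
res_df <- df %>% mutate(list1 = map2(a, b, fun))

# Converting to a tibble changes the container, not the list column's contents.
res_tbl <- as_tibble(res_df)

# Inspect the first element of the list column in each result.
is.data.frame(res_df$list1[[1]])
is.data.frame(res_tbl$list1[[1]])
```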