Use pmap() to calculate row means of several columns

Question

I'm trying to better understand how pmap() works within dataframes, and I get a surprising result when applying pmap() to compute means from several columns.

mtcars %>% 
  mutate(comp_var = pmap_dbl(list(vs, am, cyl), mean)) %>% 
  select(comp_var, vs, am, cyl)

In the above example, comp_var is equal to the value of vs in its row, rather than the mean of the three variables in a given row.

I know that I could get accurate results for comp_var using ...

mtcars %>% 
  rowwise() %>% 
    mutate(comp_var = mean(c(vs, am, cyl))) %>% 
    select(comp_var, vs, am, cyl) %>% 
  ungroup()

... but I want to understand how pmap() should be applied in a case like this.

Hint: what does `mean(1,2,3)` return? – MrFlick, May 08 '18 at 19:01

akrun · Accepted Answer · 2018-05-08T19:08:06.733

We need to concatenate the values into a single vector and pass that as the x argument of mean. The documentation (?mean) describes x as:

x: An R object. Currently there are methods for numeric/logical vectors and date, date-time and time interval objects. Complex vectors are allowed for ‘trim = 0’, only.

So if we pass separate arguments such as x1, x2, x3, only the first is matched to x; the rest go into the ... parameter, as the usage shows:

mean(x, ...)

For example:

mean(5, 8) # x is 5
#[1] 5 
mean(8, 5) # x is 8
#[1] 8
mean(c(5, 8)) # x is a vector with 2 values
#[1] 6.5
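The same effect can be seen with the three-argument call from the comment above. This is a small sketch in base R only. It relies on the signature of the default method, mean.default(x, trim = 0, na.rm = FALSE, ...), under which the second and third positional arguments are matched to trim and na.rm rather than being averaged. The names first_only and all_values are introduced here purely for illustration.

```r
# Only the first positional argument is treated as data (x).
# Here x = 1, trim = 2 and na.rm = 3, so nothing beyond 1 is averaged.
first_only <- mean(1, 2, 3)

# Wrapping the values in c() makes all of them part of x.
all_values <- mean(c(1, 2, 3))

print(first_only)  # 1
print(all_values)  # 2
```

This is why pmap_dbl(list(vs, am, cyl), mean) returns vs for every row: each row's call is effectively mean(vs, am, cyl).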

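With rowwise(), the OP concatenated the elements into a single vector using c(), whereas pmap() passes each element to mean as a separate argument, so only the first one is treated as data. Wrapping the arguments in c(...) inside the lambda fixes this: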

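library(dplyr)
library(purrr)

# c(...) gathers the per-row values into one vector before mean() sees them
out1 <- mtcars %>% 
         mutate(comp_var = pmap_dbl(list(vs, am, cyl), ~mean(c(...)))) %>% 
         dplyr::select(comp_var, vs, am, cyl)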

Checking against the rowwise output:

out2 <- mtcars %>% 
         rowwise() %>% 
         mutate(comp_var = mean(c(vs, am, cyl))) %>% 
         dplyr::select(comp_var, vs, am, cyl) %>% 
         ungroup()

all.equal(out1, out2, check.attributes = FALSE)
#[1] TRUE
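As a side note, when the goal is just a plain row mean, base R's rowMeans() is a vectorized alternative that avoids a per-row function call. The following is a minimal sketch, assuming the same three columns; cols, out3 and manual are names introduced here for illustration and are not part of the original code.

```r
# Base R only: rowMeans() averages across the selected columns for every row.
cols <- c("vs", "am", "cyl")
out3 <- mtcars[, cols]
out3$comp_var <- rowMeans(mtcars[, cols])

# Cross-check against an explicit per-row mean(c(...)).
manual <- mapply(function(vs, am, cyl) mean(c(vs, am, cyl)),
                 mtcars$vs, mtcars$am, mtcars$cyl)
same <- isTRUE(all.equal(unname(out3$comp_var), unname(manual)))
print(same)
```

The pmap() version is still the more general tool, since it works with any function of the row's values, not just the mean.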
