Adding a variable to a list of data.frames using magrittr syntax

Question

Say you have a list of data.frames that already exist in the environment:

library(magrittr)
lapply(
  paste0("z", 2011:2015),
  function(x) assign(
    x, 
    data.frame(x=rnorm(10),y=rnorm(10)),
    pos = 1
  )
)
# should create z2011 through z2015 in your R env

What I would like to do, using magrittr syntax, is: extract one column from each data.frame, combine the results into a single data.frame, and add a variable identifying which data.frame each row came from.

I realize this is something trivial using other techniques (namely: ldply(list), rbind.fill(listing), rbind_all(listing), do.call(rbind,...)). The point of my question is to understand approaches using magrittr syntax.

df <- 
   paste0("z",2011:2015) %>%
   lapply(get) %>%
   lapply(function(x) extract2(x,"x")) %>%
   # what would you do next? Another approach you think is
   # more appropriate for magrittr?

I don't know how to add a new variable. For example's sake, I would like to end up with the following:

do.call(
  rbind, 
  lapply(
    paste0("z",2011:2015), 
    function(x) {
      data.frame(x = get(x)$x, year = x)
    }
  )
)

I would do `df <- paste0("z",2011:2015) %>% lapply(get) %>% lapply(function(x) x[["x"]]) %>% as.data.frame() %>% set_names(paste0("z", 2011:2015)) %>% melt` but it isn't very idiomatic — jeremycg, Jul 30 '15 at 00:16

score 0 · Answer 1 · answered Sep 09 '15 at 20:46

I've always thought that you get a magrittr-idiomatic approach by taking nested calls and turning them inside out. Doing that to your last snippet yields

paste0("z", 2011:2015) %>%
  lapply(function(name) data.frame(x = get(name)$x, year = name)) %>% 
  do.call(rbind, .)

which looks fine to me. I'm not a huge fan of breaking every possible statement down to x %>% foo1 %>% foo2 %>% ..., and here there is an extra reason not to: otherwise you would have to repeat the paste0 call to reconstruct the variable names (as proposed in the comment).
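As a quick sanity check on the "inside out" idea, here is a minimal sketch comparing the nested call from the question with its piped form. The setup loop and the names `nested` and `piped` are only for illustration, and three data.frames of two rows each are used to keep it short.

```r
library(magrittr)

# Illustrative setup: three small data.frames in the global environment
set.seed(42)
for (nm in paste0("z", 2011:2013)) {
  assign(nm, data.frame(x = rnorm(2), y = rnorm(2)), envir = globalenv())
}

# Nested form, as in the question
nested <- do.call(
  rbind,
  lapply(
    paste0("z", 2011:2013),
    function(nm) data.frame(x = get(nm)$x, year = nm)
  )
)

# The same calls turned inside out with the pipe
piped <- paste0("z", 2011:2013) %>%
  lapply(function(nm) data.frame(x = get(nm)$x, year = nm)) %>%
  do.call(rbind, .)

# Both forms should build the same data.frame
identical(nested, piped)
```

The `.` in `do.call(rbind, .)` is needed because the piped list has to land in the second argument rather than the first.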

score 0 · Answer 2 · answered Apr 10 '18 at 07:52

data

First I'll make your example a bit shorter for better readability.

# creates data.frames z2011, z2012 and z2013, 2 rows each
lapply(
  paste0("z", 2011:2013),
  function(x) assign(
    x, 
    data.frame(x=rnorm(2),y=rnorm(2)),
    pos = 1
  )
)

magrittr + base solution

You should never use lapply(names, get); use mget instead, which returns the objects as a named list in one call. You should also use extract rather than extract2 in your lapply, since you wish to keep a data.frame.

Then the idiomatic magrittr way of assigning a column is to use inset or inset2 (they have the same effect here).

So you get:

mget(paste0("z",2011:2013)) %>%
  lapply(extract,"x") %>%
  Map(inset,.,"year",value = names(.)) %>%
  do.call(rbind,.)

#                   x  year
# z2011.1 -0.62124058 z2011
# z2011.2 -2.21469989 z2011
# z2012.1 -0.01619026 z2012
# z2012.2  0.94383621 z2012
# z2013.1  0.91897737 z2013
# z2013.2  0.78213630 z2013
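For readers unfamiliar with these helpers: extract, extract2, inset and inset2 are magrittr aliases for `[`, `[[`, `[<-` and `[[<-`. A small sketch on a made-up data.frame (the values and the names `one_col`, `vec` and `with_year` are only for illustration) shows why extract keeps a data.frame while extract2 does not:

```r
library(magrittr)

# Made-up data for illustration
df <- data.frame(x = c(1.5, 2.5), y = c(10, 20))

# extract is `[`: same as df["x"], still a one-column data.frame
one_col <- df %>% extract("x")

# extract2 is `[[`: same as df[["x"]], a bare numeric vector
vec <- df %>% extract2("x")

# inset is `[<-`: returns a copy of df with a new (recycled) column
with_year <- df %>% inset("year", value = "z2011")

is.data.frame(one_col)
is.data.frame(vec)
names(with_year)
```

Note that other packages (tidyr, for instance) also export a function called extract, so calling magrittr::extract explicitly may be safer when several packages are attached.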

using purrr

magrittr is often used with the tidyverse. Using only purrr::map_dfr (which needs dplyr installed), you could write:

library(purrr)
mget(paste0("z",2011:2013)) %>%
  map_dfr(~.["x"],.id="year")

#    year           x
# 1 z2011 -0.62124058
# 2 z2011 -2.21469989
# 3 z2012 -0.01619026
# 4 z2012  0.94383621
# 5 z2013  0.91897737
# 6 z2013  0.78213630
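As far as I know, map_dfr has been marked as superseded since purrr 1.0.0. Assuming that version or later, a sketch of the same idea with map followed by list_rbind could look like this. The named list `dfs` is a made-up stand-in for the result of mget, used here so the example does not depend on objects in the global environment.

```r
library(purrr)  # purrr re-exports %>%; assumes purrr >= 1.0.0 for list_rbind()

# Made-up named list standing in for mget(paste0("z", 2011:2013))
dfs <- list(
  z2011 = data.frame(x = c(0.1, 0.2), y = c(1, 2)),
  z2012 = data.frame(x = c(0.3, 0.4), y = c(3, 4))
)

combined <- dfs %>%
  map(~ .x["x"]) %>%              # keep only column x, as a data.frame
  list_rbind(names_to = "year")   # bind rows, storing list names in `year`

combined
```

The names of the list end up in the year column, which is the same role .id plays in map_dfr.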
