
I want to add each element of a character vector as a column to the corresponding data.frame in a list. I can do it manually, but is there a more elegant lapply way?

# Create sample dfs 
set.seed(1)
df1 <- data.frame("one" = sample(1:10,10,replace=TRUE),
                  "two" = sample(1:10,10,replace=TRUE))

df2 <- data.frame("three" = sample(1:10,10,replace=TRUE),
                  "four" = sample(1:10,10,replace=TRUE))

df3 <- data.frame("five" = sample(1:10,10,replace=TRUE),
                  "six" = sample(1:10,10,replace=TRUE))

# Combine them to list 
dflist = list("df1"=df1,"df2"=df2,"df3"=df3)

# add labelling column 
dflist$df1["label"] <- "a"
dflist$df2["label"] <- "b"
dflist$df3["label"] <- "c"

# With lapply I can only add either "a", "b" or "c" 
dflist = list("df1"=df1,"df2"=df2,"df3"=df3)
labvec <- c("a","b","c")
lapply(dflist,function(x) cbind(x,labvec[2])) # I have to select 1, 2 or 3.

Asked differently: could I also index over "labvec" with lapply?

markus
Krisselack

3 Answers


You could use Map:

Map(`[<-`, x = dflist, i = "label", value = labvec)
#$df1
#  one two label
#1   1   3     a
#2   2   1     a
#3   2   3     a

#$df2
#  three four label
#1     3    1     b
#2     2    1     b
#3     2    1     b

#$df3
#  five six label
#1    3   2     c
#2    2   3     c
#3    3   3     c

x, i and value are the arguments of the function `[<-`, which we usually do not name, as in iris['Species2'] <- "a_string_column", where

  • x : iris
  • i : 'Species2'
  • value : "a_string_column"
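
To make the correspondence concrete, here is a minimal sketch that calls `[<-` directly on a single data frame and compares the result with the usual assignment syntax. The names `df`, `via_function` and `via_assignment` are made up for illustration and are not part of the original answer.

```r
# Small illustrative data frame (hypothetical, not the data from the question)
df <- data.frame(one = 1:3)

# "Function notation": x is the data frame, i the column name, value the new content
via_function <- `[<-`(df, "label", value = "a")

# The usual assignment syntax, applied to a copy
via_assignment <- df
via_assignment["label"] <- "a"

# Both routes should give the same data frame
identical(via_function, via_assignment)
```

Map simply repeats this call once per list element, pairing each data frame with the matching element of labvec.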

The same idea as above but here we use an anonymous function with three arguments (might be easier to read):

Map(function(data, label, value) {data[label] <- value; data}, 
     data = dflist,
     label = "label",
     value = labvec) 
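
As a hedged aside: according to `?mapply`, Map is a thin wrapper around mapply with SIMPLIFY = FALSE, and shorter arguments (such as the single string "label" above) are recycled. The sketch below uses a made-up helper `add_label` and toy data to check that both calls agree; it is an illustration, not part of the original answer.

```r
# Toy inputs (hypothetical, smaller than the data in the question)
dflist <- list(df1 = data.frame(one = 1:3), df2 = data.frame(three = 4:6))
labvec <- c("a", "b")

# Hypothetical helper: add a "label" column and return the modified data frame
add_label <- function(data, value) {
  data["label"] <- value
  data
}

via_Map    <- Map(add_label, dflist, labvec)
via_mapply <- mapply(add_label, dflist, labvec, SIMPLIFY = FALSE)

# The list names are taken from the first argument in both cases
names(via_Map)
identical(via_Map, via_mapply)
```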

data

set.seed(1)
df1 <- data.frame("one" = sample(3,replace=TRUE),
                  "two" = sample(3,replace=TRUE))

df2 <- data.frame("three" = sample(3,replace=TRUE),
                  "four" = sample(3,replace=TRUE))

df3 <- data.frame("five" = sample(3,replace=TRUE),
                  "six" = sample(3,replace=TRUE))

# Combine them to list 
dflist = list("df1"=df1,"df2"=df2,"df3"=df3)
markus
  • Thank you, it works, but I don't understand why. Unfortunately, the docs of Map() are not helpful to me at all. – Krisselack Jan 25 '19 at 14:08
  • @Krisselack Glad it worked. Yes, the docs could be better, but you'll find more information under `?mapply`. `Map` applies a function, here `"[<-"`, in parallel over the arguments that you specify via `...`. A simpler (though not very useful) example would be `Map("-", c(10, 5, 3), c(5, 3, 1))`: from `10` we subtract `5`, from `5` we subtract `3`, and from `3` we subtract `1`. `Map` returns a list. – markus Jan 25 '19 at 14:13
  • This example I can easily follow, but your (easier to read) notation was hard to read for me. You first define the function: there you create data$label and assign value to it and return (the new) data. ...and in your first solution you map the function "[<-", which I not even knew that it existed, and do not even know how to access its docs... after several years of R usage! – Krisselack Jan 25 '19 at 14:25
  • "... which I not even knew that it existed" - that's why I thought the second notation would be easier to follow. You can read the docs via `help("[<-")`. The usual way to use this function is `my_data["column"] <- value`. Sometimes this "function notation" is handy when you don't want to write an anonymous function. The same example as in my previous comment, this time with an anonymous function: `Map(function(x, y) x - y, x = c(10, 5, 3), y = c(5, 3, 1))` – markus Jan 25 '19 at 14:32
  • @Krisselack You might find this helpful: http://adv-r.had.co.nz/Functions.html#all-calls – markus Jan 25 '19 at 14:40

A solution with lapply() and the use of dplyr::mutate().

library(dplyr)

dflist <- lapply(seq_along(dflist), function(i) {
  dflist[[i]] %>% 
    mutate(label = letters[i])
})
# lapply(dflist, head, 2)
# [[1]]
#   one two label
# 1   3   3     a
# 2   4   2     a
# 
# [[2]]
#   three four label
# 1    10    5     b
# 2     3    6     b
# 
# [[3]]
#   five six label
# 1    9   5     c
# 2    7   9     c

Note that this just "forces" lapply() to iterate over indices; it is basically a thinly disguised for loop.
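
To illustrate that remark, here is a minimal base R sketch (no dplyr, toy data, hypothetical names `labelled_loop` and `labelled_lapply`) that puts the explicit for loop next to the index-based lapply(). It uses labvec instead of letters, and since lapply() over indices drops the list names, setNames() is used here to restore them.

```r
# Toy inputs (hypothetical, smaller than the data in the question)
dflist <- list(df1 = data.frame(one = 1:3),
               df2 = data.frame(three = 4:6),
               df3 = data.frame(five = 7:9))
labvec <- c("a", "b", "c")

# Explicit for loop over the indices
labelled_loop <- dflist
for (i in seq_along(labelled_loop)) {
  labelled_loop[[i]]$label <- labvec[i]
}

# The same iteration written with lapply(); names are restored afterwards
labelled_lapply <- setNames(
  lapply(seq_along(dflist), function(i) {
    df <- dflist[[i]]
    df$label <- labvec[i]
    df
  }),
  names(dflist)
)

identical(labelled_loop, labelled_lapply)
```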

RLave

Using map2 from the tidyverse (purrr):

library(tidyverse)
map2(dflist, labvec, ~ .x %>% 
                      mutate(label = .y))
#$df1
#   one two label
#1    3   3     a
#2    4   2     a
#3    6   7     a
#4   10   4     a
#5    3   8     a
#6    9   5     a
#7   10   8     a
#8    7  10     a
#9    7   4     a
#10   1   8     a

#$df2
#   three four label
#1     10    5     b
#2      3    6     b
#3      7    5     b
#4      2    2     b
#5      3    9     b
#6      4    7     b
#7      1    8     b
#8      4    2     b
#9      9    8     b
#10     4    5     b

#$df3
#   five six label
#1     9   5     c
#2     7   9     c
#3     8   5     c
#4     6   3     c
#5     6   1     c
#6     8   1     c
#7     1   4     c
#8     5   6     c
#9     8   7     c
#10    7   5     c
akrun