I have a bunch of data frames stored in a single list. My goal is to format each data frame in the list such that values in a specific column turn into column names. Since I would like every data frame in the list to be transformed, I tried to apply the spread function in tidyverse over all elements in the list. However, I am receiving the following error:

the condition has length > 1 and only the first element will be used
Error: `var` must evaluate to a single number or a column name, not a double vector

Here's a dummy example I borrowed from "How to control new variables' names after tidyr's spread?" to facilitate the discussion:

Create dummy data frames:

df1 <- data.frame(
    id = rep(1:3, rep(2,3)), 
    year = rep(c(2012, 2013), 3), 
    value = runif(6)
)

df2 <- data.frame(
    id = rep(4:6, rep(2,3)), 
    year = rep(c(2012, 2013), 3), 
    value = runif(6)
)

Store data frames in a list:

list <- list(df1, df2)

list[[1]]
#  id year      value
#1  1 2012 0.09668064
#2  1 2013 0.62739399
#3  2 2012 0.45618433
#4  2 2013 0.60347152
#5  3 2012 0.84537624
#6  3 2013 0.33466030

Desired outcome for list[[1]]:

#  id       2012      2013
#1  1 0.09668064 0.6273940
#2  2 0.45618433 0.6034715
#3  3 0.84537624 0.3346603

My attempt at spreading keys/values over all data frames stored as elements in a list:

library(tidyverse)
for (i in 1:2){
  list[[i]] %>% spread(key = list[[i]][,2], value = list[[i]][,3])
}
mochi
  • `library(tidyr); lapply(list, function(x) spread(x, year, value))` I assume you do not insist on the for-loop. – markus Feb 07 '18 at 23:41
  • @markus Thank you! Is there a way to reference columns using indices instead of their names? In my actual data set `value` in `spread(x, year, value)` varies across data frames. – mochi Feb 08 '18 at 00:12
  • mochi, glad I could help. Please consider accepting @akrun's answer if it solved your problem. – markus Feb 09 '18 at 10:50

1 Answer

It is better not to use indices for key/value, as any change in column order would produce wrong results, but if the positions are known, then:

library(tidyverse)
res <- map(list, ~ .x %>% 
                     spread(key = 2, value = 3))

Compare with the key/value passed as column names. We would recommend using the names:

resOld <- map(list, ~ .x %>% 
                        spread(key = year, value = value))
identical(res, resOld)
#[1] TRUE
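
If `spread()` is not a requirement, the same reshaping can be sketched with `pivot_wider()`, which superseded `spread()` in tidyr 1.0.0. The sketch below assumes the key is always the second column and the value the third; it looks up the column names from those positions, so the value column may be named differently in each data frame. The names `dfs`, `widen_by_position` and `wide` are hypothetical and only used for illustration.

```r
library(tidyr)

set.seed(1)  # only so the dummy values are reproducible

df1 <- data.frame(
    id = rep(1:3, each = 2),
    year = rep(c(2012, 2013), 3),
    value = runif(6)
)
df2 <- data.frame(
    id = rep(4:6, each = 2),
    year = rep(c(2012, 2013), 3),
    value = runif(6)
)

# Named `dfs` rather than `list` to avoid masking base::list()
dfs <- list(df1, df2)

# Hypothetical helper: translate column positions into names, then reshape.
# Assumption: column 2 holds the keys and column 3 holds the values.
widen_by_position <- function(d, key_pos = 2, value_pos = 3) {
    pivot_wider(
        d,
        names_from = all_of(names(d)[key_pos]),
        values_from = all_of(names(d)[value_pos])
    )
}

# lapply() returns a new list; the original data frames are left untouched
wide <- lapply(dfs, widen_by_position)
wide[[1]]
```

Note that `pivot_wider()` returns tibbles. If a `for` loop is preferred, the result of each iteration has to be assigned back (for example `dfs[[i]] <- widen_by_position(dfs[[i]])`), otherwise the reshaped data frame is discarded.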
akrun