I am trying to pivot several pairs of key-value variables using tidyr::spread().

id <- c(0,1,2,3,4,5,6,7,8,9)
key1 <- c("a", "a", "b", "b", "c","a", "a", "b", "b", "c")
val1 <- c(1,2,3,1,2,3,1,2,3,1)
key2 <- c("d",NA,NA,NA,"e","d","d",NA,"b",NA)
val2 <- c(1,NA,NA,NA,2,3,NA,NA,3,NA)
key3 <- c("x",NA,NA,NA,"e","d",NA,NA,NA,NA)
val3 <- c(0,NA,NA,NA,NA,3,1,NA,NA,NA)
df = data.frame(id, key1, val1, key2, val2, key3, val3) 
library(tidyr)
c1 <- spread(df, key1, val1, fill = 0, convert = FALSE)
c2 <- spread(c1, key2, val2, fill = 0, convert = FALSE)
c3 <- spread(c2, key3, val3, fill = 0, convert = FALSE) 

While running spread(), I get the following error:

Error in [.data.frame(data, setdiff(names(data), c(key_col, value_col))) : undefined columns selected

This makes me think that the problem is in the values rather than in the variable names, as the error implies. Any ideas what to look for?

By the same token, is there a more concise way to spread multiple pairs of key-value variables?

ronencozen
  • Can you show the `dput` output of a small example that reproduces the error? – akrun Jul 16 '15 at 11:10
  • Please update it in your post. There are some formatting issues with quotes around `"c"(`. – akrun Jul 16 '15 at 12:37
  • I think the error occurs because an `NA` column is created in the step `c2`. When we call `spread` again with that `NA` column present, the names may clash as another NA column is created. You could change the column names before the last step, or remove the NA column: `spread(c2[1:9], key3, val3, fill=0, convert=FALSE)` – akrun Jul 16 '15 at 12:42
  • It's not clear: why does spread() treat NA as a column name? – ronencozen Jul 16 '15 at 12:52
  • I am not sure about the reason. But you could do this in a loop with `Map`, i.e. `do.call(cbind,Map(function(x,y) {x1 <- data.frame(x,y); res <- spread(x1, x,y, fill=0, convert=FALSE); res[!is.na(names(res))] }, df[-1][c(TRUE,FALSE)], df[-1][c(FALSE, TRUE)]))` – akrun Jul 16 '15 at 13:02
  • Very clever! Thank you. – ronencozen Jul 16 '15 at 13:14
  • Just an additional note if anyone has a similar issue to the one I had. I got this error in a data frame with no NA values. The issue was that my first column had no column name/header. I used `fix()` to correct it, and `spread()` now works normally. – AudileF Oct 04 '17 at 11:17

1 Answer

You can use `Map` to spread each key/value pair separately and then bind the results together:

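library(tidyr)

# Spread each key/value pair on its own, drop any column whose name is NA,
# then bind the pieces together column-wise.
res <- do.call(cbind, Map(function(x, y) {
  x1 <- data.frame(x, y)
  r1 <- spread(x1, x, y, fill = 0, convert = FALSE)
  r1[!is.na(names(r1))]
}, df[-1][c(TRUE, FALSE)], df[-1][c(FALSE, TRUE)]))

# Strip the list-name prefix (e.g. "key1.") from the column names
names(res) <- sub('.*\\.', '', names(res))
cbind(df, res)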
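
A possible variant of the same idea, shown only as a sketch: spread each pair while keeping `id` as the row identifier, drop rows whose key is `NA` before spreading (so no `NA`-named column is created), and merge the pieces back on `id`. The helper name `spread_pair` and the `keyN_` column prefix are assumptions made for illustration; they are not part of tidyr. The sketch also assumes `id` is unique per row, as it is in the question's data.

```r
library(tidyr)

# Data from the question
df <- data.frame(
  id   = 0:9,
  key1 = c("a", "a", "b", "b", "c", "a", "a", "b", "b", "c"),
  val1 = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1),
  key2 = c("d", NA, NA, NA, "e", "d", "d", NA, "b", NA),
  val2 = c(1, NA, NA, NA, 2, 3, NA, NA, 3, NA),
  key3 = c("x", NA, NA, NA, "e", "d", NA, NA, NA, NA),
  val3 = c(0, NA, NA, NA, NA, 3, 1, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: spread one key/value pair, keeping `id`
spread_pair <- function(data, key_col, value_col) {
  # Keep only rows with a non-missing key, so spread() never sees an NA key
  pair <- data[!is.na(data[[key_col]]), c("id", key_col, value_col)]
  names(pair) <- c("id", "key", "value")
  wide <- spread(pair, key, value, fill = 0)
  # Prefix the new columns so the same key in different pairs cannot collide
  names(wide)[-1] <- paste(key_col, names(wide)[-1], sep = "_")
  # Restore the ids that were dropped above and fill their cells with 0
  merged <- merge(data["id"], wide, by = "id", all.x = TRUE)
  merged[is.na(merged)] <- 0
  merged
}

pairs <- list(c("key1", "val1"), c("key2", "val2"), c("key3", "val3"))
parts <- lapply(pairs, function(p) spread_pair(df, p[1], p[2]))

# Join the per-pair wide tables on id
result <- Reduce(function(a, b) merge(a, b, by = "id"), parts)
```

The prefix is there because the key `"b"` appears in both `key1` and `key2`; without it the merged table would contain two columns with the same name.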
akrun