Reshape2: multiple observations for variable

Question

I have the following sample data:

d <- data.frame(id=c(1,1,1,2,2), time=c(1,1,1,1,1), var=runif(5))
  id time         var
1  1    1 0.373448545
2  1    1 0.007007124
3  1    1 0.840572603
4  2    1 0.684893481
5  2    1 0.822581501

I want to reshape this data.frame to wide format using dcast such that the output is the following:

  id     var.1       var.2     var.3
1  1 0.3734485 0.007007124 0.8405726
2  2 0.6848935 0.822581501        NA

Does anyone have any ideas?

G. Grothendieck · Accepted Answer · 2016-03-25T00:53:50.360

3

Create a sequence column, seq, by id and then use dcast:

library(reshape2)

set.seed(123)
d <- data.frame(id=c(1,1,1,2,2), time=c(1,1,1,1,1), var=runif(5))

d2 <- transform(d, seq = ave(id, id, FUN = seq_along))
dcast(d2, id ~ seq, value.var = "var")

giving:

  id       1       2       3
1  1 0.28758 0.78831 0.40898
2  2 0.88302 0.94047      NA
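
If the `var.1`, `var.2`, ... column names from the question matter, the same `seq` counter can also be fed to base R's `reshape()`. This is only a sketch of an alternative, not part of the reshape2 approach above; the name `wide` is made up for illustration, and `time` is dropped on the assumption that it is not needed in the result.

```r
set.seed(123)
d <- data.frame(id = c(1, 1, 1, 2, 2), time = c(1, 1, 1, 1, 1), var = runif(5))

# Within-id counter: 1, 2, 3 for id 1 and 1, 2 for id 2
d2 <- transform(d, seq = ave(id, id, FUN = seq_along))

# Keep only id, seq and var; reshape() names the new columns var.<seq>
wide <- reshape(d2[c("id", "seq", "var")],
                idvar = "id", timevar = "seq", direction = "wide")
wide
```

The result has one row per `id` and columns `id`, `var.1`, `var.2`, `var.3`, with `NA` where an id has fewer observations.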

edited Mar 25 '16 at 00:53

answered Mar 25 '16 at 00:02

Interesting - this code works well on the example data. But when I use it on my original data, the values for the first row are not spread horizontally across the columns (as we want) but stacked vertically, so that they end up in rows two and three of column 1. Does anyone know why this is the case? The transform step itself worked well. – wake_wake Mar 28 '16 at 14:39

score 2 · Answer 2 · answered Mar 25 '16 at 01:21

A dplyr/tidyr option with spread would be

library(dplyr)
library(tidyr)
d %>%
  group_by(id) %>%
  mutate(n1 = paste0("var.", row_number())) %>%
  spread(n1, var) %>%
  select(-time)
#      id     var.1       var.2     var.3
#    (int)     (dbl)       (dbl)     (dbl)
#1     1 0.3734485 0.007007124 0.8405726
#2     2 0.6848935 0.822581501        NA
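
In tidyr 1.0.0 and later, `spread()` is superseded by `pivot_wider()`. A minimal sketch of the same idea with the newer function, assuming a recent dplyr/tidyr; the name `wide` and the `set.seed()` call are only there to make the illustration reproducible:

```r
library(dplyr)
library(tidyr)

set.seed(123)
d <- data.frame(id = c(1, 1, 1, 2, 2), time = c(1, 1, 1, 1, 1), var = runif(5))

wide <- d %>%
  group_by(id) %>%
  mutate(n1 = row_number()) %>%   # within-id counter
  ungroup() %>%
  select(-time) %>%               # leave id as the only identifying column
  pivot_wider(names_from = n1, values_from = var, names_prefix = "var.")
wide
```

Here `names_prefix` builds the `var.` part of the column names, so the counter can stay a plain integer.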

score 1 · Answer 3 · answered Mar 25 '16 at 00:02

Ok - here's a working solution. The key is to add a counting variable. My solution for this is a bit complicated - maybe you can come up with something better.

library(dplyr)
library(magrittr)
library(reshape2)

d <- data.frame(id=c(1,1,1,2,2,3,3,3,3), time=c(1,1,1,1,1,1,1,1,1), var=runif(9))

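# Count the rows per id
group_by(d, id) %>%
  summarise(n = n()) %>%
  data.frame() -> count

# Build the within-id counter 1..n for each id (assumes d is sorted by id)
f <- c()
for (i in seq_len(nrow(count))) {
  f <- c(f, seq_len(count$n[i]))
}

# Attach the counter to the data
d <- data.frame(d, f)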

dcast(d, id ~ f, value.var = "var")
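
For what it's worth, the counting loop can probably be shortened with base R's `sequence()`, which concatenates `1:n` for each count it is given. A rough sketch, under the same assumption as the loop that the rows of each id sit next to each other; `n` is just an illustrative name:

```r
set.seed(123)
d <- data.frame(id = c(1, 1, 1, 2, 2, 3, 3, 3, 3), time = 1, var = runif(9))

# Run lengths of consecutive equal ids: 3, 2, 4
n <- rle(d$id)$lengths

# sequence() yields 1:3, 1:2, 1:4 joined together, like the loop does
d$f <- sequence(n)
d$f
```

The resulting `d` can be passed to `dcast(d, id ~ f, value.var = "var")` as before.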