Suppose I have the following data frame:

d <- data.frame(id=c(1,1,1,2,2,3,3,3), time=c(1,2,3,1,2,1,2,3), var=runif(8))

 d
  id time       var
1  1    1 0.3733586
2  1    2 0.5743769
3  1    3 0.8253280
4  2    1 0.8136957
5  2    2 0.8726963
6  3    1 0.1105549
7  3    2 0.9527002
8  3    3 0.5690021

With the base reshape function, I can transform it to a "wide" format by specifying an idvar (which identifies rows belonging to the same unit) and a timevar (which identifies different observations of the same unit):

reshape(d, idvar="id", timevar="time", direction="wide")

  id     var.1     var.2     var.3
1  1 0.3733586 0.5743769 0.8253280
4  2 0.8136957 0.8726963        NA
6  3 0.1105549 0.9527002 0.5690021

I tried to do the same with the dcast function from reshape2, but couldn't find a way. Is it possible?

EDIT: Ananda Mahto's comment and answer are perfectly right. The real question was how to cast the original data frame when it has several var columns. My example was not appropriate, sorry.

juba
  • I suspect your actual question is different from my answer, and is based on your work on [this question](http://stackoverflow.com/q/14749237/1270695) where you have more than one possible `value.var`. Can you update your question to reflect that? – A5C1D2H2I1M1N2O1R2T1 Feb 07 '13 at 11:33
  • @AnandaMahto You're perfectly right! – juba Feb 07 '13 at 12:44

1 Answer

Doesn't the following work?

library(reshape2)
dcast(d, id ~ time)
# Using var as value column: use value.var to override.
#   id         1          2         3
# 1  1 0.2869739 0.59591690 0.8989719
# 2  2 0.4533770 0.14741778        NA
# 3  3 0.1286770 0.02465634 0.7363114

## OR, to get rid of the message:
## dcast(d, id ~ time, value.var = "var")

I suspect, though, that you're asking a slightly different question (as mentioned in my comment). In particular, what if you were starting with:

set.seed(1)
d <- data.frame(id = c(1,1,1,2,2,3,3,3), 
                time = c(1,2,3,1,2,1,2,3), 
                var1 = runif(8),
                var2 = runif(8))

With base R's reshape, it's just one line:

reshape(d, direction = "wide", idvar = "id", timevar = "time")
#   id    var1.1    var2.1    var1.2     var2.2    var1.3    var2.3
# 1  1 0.2655087 0.6291140 0.3721239 0.06178627 0.5728534 0.2059746
# 4  2 0.9082078 0.1765568 0.2016819 0.68702285        NA        NA
# 6  3 0.8983897 0.3841037 0.9446753 0.76984142 0.6607978 0.4976992

Let's try the same with dcast from "reshape2". Here's the approach we might be tempted to take:

library(reshape2)
dcast(d, id ~ time)
# Using var2 as value column: use value.var to override.
#   id         1          2         3
# 1  1 0.6291140 0.06178627 0.2059746
# 2  2 0.1765568 0.68702285        NA
# 3  3 0.3841037 0.76984142 0.4976992

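But that doesn't give what we want: dcast expects a single value.var, so it uses only var2 (as the message says) and silently drops var1. So, we first need to melt the data into a fully long format, with one row per id, time, and variable.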

d2 <- melt(d, id.vars = c("id", "time"))
head(d2)
#   id time variable     value
# 1  1    1     var1 0.2655087
# 2  1    2     var1 0.3721239
# 3  1    3     var1 0.5728534
# 4  2    1     var1 0.9082078
# 5  2    2     var1 0.2016819
# 6  3    1     var1 0.8983897

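Now, we can use dcast quite easily. Putting variable + time on the right-hand side of the formula creates one column per variable/time combination.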

dcast(d2, id ~ variable + time)
#   id    var1_1    var1_2    var1_3    var2_1     var2_2    var2_3
# 1  1 0.2655087 0.3721239 0.5728534 0.6291140 0.06178627 0.2059746
# 2  2 0.9082078 0.2016819        NA 0.1765568 0.68702285        NA
# 3  3 0.8983897 0.9446753 0.6607978 0.3841037 0.76984142 0.4976992
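
As a sanity check, here is a minimal sketch comparing the melt + dcast result with the base reshape result from above. It assumes reshape2 is installed; the names wide_base and wide_cast are only illustrative. The two outputs differ in column naming (var1.1 vs. var1_1), column order, and row names, so the sketch aligns names and order before comparing and ignores attributes.

```r
library(reshape2)

set.seed(1)
d <- data.frame(id = c(1,1,1,2,2,3,3,3),
                time = c(1,2,3,1,2,1,2,3),
                var1 = runif(8),
                var2 = runif(8))

# Base R: wide format, columns named var1.1, var2.1, var1.2, ...
wide_base <- reshape(d, direction = "wide", idvar = "id", timevar = "time")

# reshape2: melt to long format, then cast; columns named var1_1, var1_2, ...
d2 <- melt(d, id.vars = c("id", "time"))
wide_cast <- dcast(d2, id ~ variable + time)

# Rename "var1_1" to "var1.1" etc., then reorder columns to match reshape()
names(wide_cast) <- sub("_", ".", names(wide_cast), fixed = TRUE)
wide_cast <- wide_cast[, names(wide_base)]

# Compare values only; row names and reshape()'s extra attributes are ignored
all.equal(wide_cast, wide_base, check.attributes = FALSE)
```

If the column separator matters downstream, renaming after the cast (as done here with sub) is probably simpler than trying to make dcast produce the dotted names directly.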
A5C1D2H2I1M1N2O1R2T1