14

I'm having trouble stacking columns in a data.frame into one column. Now my data looks something like this:

id   time    black   white   red 
a     1       b1      w1     r1
a     2       b2      w2     r2
a     3       b3      w3     r3
b     1       b4      w4     r4
b     2       b5      w5     r5
b     3       b6      w6     r6

I'm trying to transform the data.frame so that it looks like this:

id   time  colour 
a     1     b1
a     2     b2
a     3     b3
b     1     b4
b     2     b5
b     3     b6
a     1     w1
a     2     w2
a     3     w3
b     1     w4
b     2     w5
b     3     w6
a     1     r1
a     2     r2
.     .     .
.     .     .
.     .     .

I'm guessing that this problem requires using the reshape package, but I'm not exactly sure how to use it to stack multiple columns under one column. Can anyone provide help on this?

econlearner
  • 351
  • 2
  • 3
  • 10
  • Have you looked at the help for `melt` in the `reshape` (or more uptodate `reshape2` package. – mnel Nov 28 '12 at 03:06

2 Answers2

13

Here's melt from reshape:

library(reshape)
melt(x, id.vars=c('id', 'time'),var='color')

And using reshape2 (an up-to-date, faster version of reshape) the syntax is almost identical.

The help files have useful examples (see ?melt and the link to melt.data.frame).

In your case, something like the following will work (assuming your data.frame is called DF)

library(reshape2)
melt(DF, id.var = c('id','time'), variable.name = 'colour')
mnel
  • 113,303
  • 27
  • 265
  • 254
Matthew Lundberg
  • 42,009
  • 6
  • 90
  • 112
  • Related question: is there a way to use melt.data.frame to stack all columns of a data.frame? Does setting id.vars=c() work? – W7GVR Feb 09 '15 at 18:43
  • @gvrocha Indeed it does. Note that supplying `c()` is different from not supplying the option. See the "Details" section of `?melt`. – Matthew Lundberg Feb 09 '15 at 18:59
10

Since you mention "stacking" in your title, you can also look at the stack function in base R:

cbind(mydf[1:2], stack(mydf[3:5]))
#    id time values   ind
# 1   a    1     b1 black
# 2   a    2     b2 black
# 3   a    3     b3 black
# 4   b    1     b4 black
# 5   b    2     b5 black
# 6   b    3     b6 black
# 7   a    1     w1 white
# 8   a    2     w2 white
# 9   a    3     w3 white
# 10  b    1     w4 white
# 11  b    2     w5 white
# 12  b    3     w6 white
# 13  a    1     r1   red
# 14  a    2     r2   red
# 15  a    3     r3   red
# 16  b    1     r4   red
# 17  b    2     r5   red
# 18  b    3     r6   red

If the values in the "black", "white", and "red" columns are factors, you'll need to convert them to character values first.

cbind(mydf[1:2], stack(lapply(mydf[3:5], as.character)))
A5C1D2H2I1M1N2O1R2T1
  • 190,393
  • 28
  • 405
  • 485