Data:

    set.seed(0)
    date <- rep(1:4, 3)
    N <- length(date)
    A <- rnorm(N)
    B <- rnorm(N)
    C <- rnorm(N)
    mydata <- data.frame(date, A, B, C)

   date            A          B           C
1     1  1.262954285 -1.1476570 -0.05710677
2     2 -0.326233361 -0.2894616  0.50360797
3     3  1.329799263 -0.2992151  1.08576936
4     4  1.272429321 -0.4115108 -0.69095384
5     1  0.414641434  0.2522234 -1.28459935
6     2 -1.539950042 -0.8919211  0.04672617
7     3 -0.928567035  0.4356833 -0.23570656
8     4 -0.294720447 -1.2375384 -0.54288826
9     1 -0.005767173 -0.2242679 -0.43331032
10    2  2.404653389  0.3773956 -0.64947165

and this is what I am trying to achieve:

date name   value
1    A    1.262954285 
1    B   -1.1476570
1    C   -0.05710677
2    A   -0.326233361
2    B   -0.2894616
2    C    0.50360797
... ...   ...

I believe that I am supposed to use melt(), but I get something a bit different:

> M <-  melt(mydata,id.vars = "date")
> head(M)
  date   variable    value
1    1      A    1.2629543
2    2      A   -0.3262334
3    3      A    1.3297993
4    4      A    1.2724293
5    1      A    0.4146414
6    2      A   -1.5399500

Can I tweak melt() somehow to get it right?

Per
  • Looks like melt is what you want. The result is just sorted by `variable`, not `date`, as it appears to be in what you're trying to achieve. – Matthew Plourde May 26 '15 at 13:34
  • `M[order(M$date), ]` is ok? If not, then you might need to add a grouping variable to *mydata* with something like `rep(1:k, each = 4)` – Pafnucy May 26 '15 at 13:36
  • Yes, I see that. I thought the sorting was fixed by the id.vars, and I put "date" there. How can I sort it by date? Sorry if I ask the obvious.. – Per May 26 '15 at 13:37

1 Answer

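library(reshape2)  # provides melt()
library(dplyr)     # provides arrange()
set.seed(0)
date <- rep(1:4, 3)
N <- length(date)
A <- rnorm(N)
B <- rnorm(N)
C <- rnorm(N)
mydata <- data.frame(date, A, B, C)
long <- melt(mydata, id.vars = "date")
# sort by date, then by variable within each date
sorted <- arrange(long, date, variable)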

You can sort by date with dplyr's arrange(). Base R's order() works too, but its syntax is more cumbersome than arrange().
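For comparison, here is a minimal sketch of the same sort in base R with order(), assuming reshape2 is installed for melt(). The name `sorted_base` is made up for this illustration.

```r
library(reshape2)  # provides melt()

# Rebuild the example data from the question
set.seed(0)
date <- rep(1:4, 3)
N <- length(date)
A <- rnorm(N)
B <- rnorm(N)
C <- rnorm(N)
mydata <- data.frame(date, A, B, C)

long <- melt(mydata, id.vars = "date")

# order() returns the row permutation that sorts by date, then by variable
sorted_base <- long[order(long$date, long$variable), ]
rownames(sorted_base) <- NULL  # renumber rows after reordering
```

This should give the same row order as arrange(long, date, variable); the only extra step is resetting the row names.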

However, the sorted result still won't match your target exactly, because each date occurs three times in mydata, so every date/variable combination has three rows. It looks like this:

   date variable        value
1     1        A  1.262954285
2     1        A  0.414641434
3     1        A -0.005767173
4     1        B -1.147657009
5     1        B  0.252223448
6     1        B -0.224267885
7     1        C -0.057106774
8     1        C -1.284599354
9     1        C -0.433310317
10    2        A -0.326233361
11    2        A -1.539950042
12    2        A  2.404653389
13    2        B -0.289461574
14    2        B -0.891921127
15    2        B  0.377395646
16    2        C  0.503607972
17    2        C  0.046726172
18    2        C -0.649471647
19    3        A  1.329799263
20    3        A -0.928567035
21    3        A  0.763593461
22    3        B -0.299215118
23    3        B  0.435683299
24    3        B  0.133336361
25    3        C  1.085769362
26    3        C -0.235706556
27    3        C  0.726750747
28    4        A  1.272429321
29    4        A -0.294720447
30    4        A -0.799009249
31    4        B -0.411510833
32    4        B -1.237538422
33    4        B  0.804189510
34    4        C -0.690953840
35    4        C -0.542888255
36    4        C  1.151911754
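If the goal is one A, B, C triple per original row (as in the target output in the question), one possible approach is to carry a row identifier through melt() and sort on it, in the spirit of the grouping-variable suggestion in the comments. This is only a sketch: the column name `id` and the object names `with_id`, `long_id`, and `result` are made up for this illustration.

```r
library(reshape2)  # provides melt()

# Rebuild the example data from the question
set.seed(0)
date <- rep(1:4, 3)
N <- length(date)
A <- rnorm(N)
B <- rnorm(N)
C <- rnorm(N)
mydata <- data.frame(date, A, B, C)

# Hypothetical helper column: remembers which original row each value came from
with_id <- cbind(id = seq_len(nrow(mydata)), mydata)

# Keep both id and date as identifier columns while melting A, B, C
long_id <- melt(with_id, id.vars = c("id", "date"))

# Sort by original row, then by variable, so each row yields A, B, C in turn
result <- long_id[order(long_id$id, long_id$variable), c("date", "variable", "value")]
rownames(result) <- NULL
```

With this ordering, the first three rows of `result` hold A, B and C for the first row of mydata, the next three hold the second row, and so on.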
jrdnmdhl