Turn wide to long dataframe with R reshape function, with three levels of duplicates

Question

I am struggling with R reshape function all the time and still would like to use it consistently as it is R-core. I stumbled onto this problem having wide dataframe like this

dw <- read.table(header=T, text='
 sbj A.f1.avg A.f1.sd A.f2.avg A.f2.sd B.f1.avg B.f1.sd B.f2.avg B.f2.sd
   A       10       6       50      10       10       6       50      10  
   B       12       5       70      11       12       5       70      11  
   C       20       7       20      8        20       7       20      8   
   D       22       8       22      9        22       8       22      9   
 ')

and I wish for this outcome:

sbj AB  f   var res
A   A   f1  avg 10
B   A   f1  avg 12
C   A   f1  avg 20
D   A   f1  avg 22
A   A   f1  sd  6
B   A   f1  sd  5
C   A   f1  sd  7
D   A   f1  sd  8
A   A   f2  avg 50
B   A   f2  avg 70
C   A   f2  avg 20
D   A   f2  avg 22
A   A   f2  sd  10
B   A   f2  sd  11
C   A   f2  sd  8
D   A   f2  sd  9
A   B   f1  avg 10
B   B   f1  avg 12
C   B   f1  avg 20
D   B   f1  avg 22
A   B   f1  sd  6
B   B   f1  sd  5
C   B   f1  sd  7
D   B   f1  sd  8
A   B   f2  avg 50
B   B   f2  avg 70
C   B   f2  avg 20
D   B   f2  avg 22
A   B   f2  sd  10
B   B   f2  sd  11
C   B   f2  sd  8
D   B   f2  sd  9

Please, help!

score 2 · Answer 1 · answered Oct 03 '17 at 16:27

2

With dplyr and tidyr:

library(dplyr)
library(tidyr)

X <- gather(dw, var, res, -sbj) %>%
    separate(var, into = c("AB", "f", "var"))

Result:

> head(X)
  sbj AB  f var res
1   A  A f1 avg  10
2   B  A f1 avg  12
3   C  A f1 avg  20
4   D  A f1 avg  22
5   A  A f1  sd   6
6   B  A f1  sd   5

answered Oct 03 '17 at 16:27

ulfelder

5,305
1
22
40

1

Thanks! So it's impossible/harder with reshape()? – striatum Oct 03 '17 at 16:32

score 2 · Accepted Answer · answered Oct 03 '17 at 16:36

Using reshape2...

library(reshape2)
dw2 <- melt(dw, id.vars="sbj", value.name="res") #melt to long format

#create new variables by splitting column at dots
dw2[, c("AB", "f", "var")] <- t(as.data.frame((strsplit(as.character(dw2$variable),"\\."))))

#reorder variables
dw2 <- dw2[,c("sbj", "AB", "f", "var", "res")]

dw2
   sbj AB  f var   res
1    A  A f1 avg    10
2    B  A f1 avg    12
3    C  A f1 avg    20
4    D  A f1 avg    22
5    A  A f1  sd     6
6    B  A f1  sd     5
7    C  A f1  sd     7
8    D  A f1  sd     8
9    A  A f2 avg    50
10   B  A f2 avg    70
11   C  A f2 avg    20
12   D  A f2 avg    22
13   A  A f2  sd    10
14   B  A f2  sd    11
15   C  A f2  sd     8
16   D  A f2  sd     9
17   A  B f1 avg    10
18   B  B f1 avg    12
19   C  B f1 avg    20
20   D  B f1 avg    22
21   A  B f1  sd     6
22   B  B f1  sd     5
23   C  B f1  sd     7
24   D  B f1  sd     8
25   A  B f2 avg    50
26   B  B f2 avg    70
27   C  B f2 avg    20
28   D  B f2 avg    22
29   A  B f2  sd    10
30   B  B f2  sd    11
31   C  B f2  sd     8
32   D  B f2  sd     9

lmo · Answer 3 · 2017-10-03T17:24:14.453

Here is a base R method that uses stack and cbind for the initial "reshape long" and then performs a second step to decompose the variable names of dw into separate columns.

# get a 3 column df
dat <- cbind(dw[1], stack(dw[-1]))

# split the variable names, reorder columns, and provide desired names
dat <- setNames(cbind(dat[1],
                      do.call(rbind, strsplit(as.character(dat$ind), ".", fixed=TRUE)),
                      dat[2]),
                 c("sbj", "AB", "f", "var", "res"))

Turn wide to long dataframe with R reshape function, with three levels of duplicates

3 Answers3