reshape wide into long while splitting

Question

I am looking for reshaping:

  ID p2012 p2010 p2008 p2006 c2012 c2010 c2008 c2006
1  1   160   162   163   165  37.3  37.3  37.1  37.1
2  2   163   164   164   163   2.6   2.6   2.6   2.6

into:

    ID  year    p   c
1   1   2006    165 37.1
2   1   2008    164 37.1
3   1   2010    162 37.3
4   1   2012    160 37.3
5   2   2006    163 2.6
6   2   2008    163 2.6
7   2   2010    164 2.6
8   2   2012    163 2.6

I am new to R, have been trying around with melt and dcast functions, but there are just to many twists for me at this stage. Help would be much appreciated!

A dput of my df:

structure(list(ID = 1:2, p2012 = c(160L, 163L), p2010 = c(162L, 164L), p2008 = 163:164, p2006 = c(165L, 163L), c2012 = c(37.3, 2.6), c2010 = c(37.3, 2.6), c2008 = c(37.1, 2.6), c2006 = c(37.1, 2.6)), .Names = c("ID", "p2012", "p2010", "p2008", "p2006", "c2012", "c2010", "c2008", "c2006"), class = "data.frame", row.names = c(NA, -2L))

Welcome to SO! Maybe you can make it a bit simpler for us to answer your question: Using `dput(yourDF)` R gives you code you can dump here that allows us to reconstruct your data within seconds within our installation. — Thilo, May 22 '14 at 07:37

score 4 · Accepted Answer · answered May 22 '14 at 07:56

4

An alternative to shadow's answer is to use the reshape function:

reshape(d, direction='long', varying=list(2:5, 6:9), v.names=c("p", "c"), idvar="ID", times=c(2012, 2010, 2008, 2006))

This assumes that you know the column indices of the p and c beforehand (or add additional code to figure them out). Furthermore, the times vector above could be found by using something similar to the gsub function of shadow.

Which way to use probably is a matter of taste.

answered May 22 '14 at 07:56

Thilo

8,827
2
35
56

Thanks, you guys are marvelous! Both ways work for me. – nils May 22 '14 at 08:01
1

In case you don't know the right indices , `list(grep('p',colnames(d)),grep('c',colnames(d)))` – agstudy May 22 '14 at 08:02
thanks for the specification, I was able to `reshape` my original data.frame! – nils May 22 '14 at 08:44

score 2 · Answer 2 · answered May 22 '14 at 07:49

You probably have to melt the data first, then split the variable and the year and then dcast to your final data.frame.

require(reshape2)
# melt data.frame
dfmelt <- melt(df, id.vars="ID", variable.name="var.year")
# split "var.year" into new variables "var" and "year" 
dfmelt[, "var"] <- gsub("[0-9]", "", as.character(dfmelt[, "var.year"]))
dfmelt[, "year"] <- as.numeric(gsub("[a-z, A-Z]", "", as.character(dfmelt[, "var.year"])))
# cast to data with column for each var-name
dcast(dfmelt, ID+year~var, value.var="value")

score 2 · Answer 3 · answered Jun 21 '14 at 13:48

You can also use the following solution from tidyr. You don't actually need to use regular expressions, if "p" or "c" is always the first letter of the column names:

library(tidyr)
library(dplyr) # only loaded for the %>% operator

dat %>%
  gather(key,value,p2012:c2006) %>%
  separate(key,c("category","year"),1) %>%
  spread(category,value)
  ID year    c   p
1  1 2006 37.1 165
2  1 2008 37.1 163
3  1 2010 37.3 162
4  1 2012 37.3 160
5  2 2006  2.6 163
6  2 2008  2.6 164
7  2 2010  2.6 164
8  2 2012  2.6 163

yep, this also works. thx. `tidyr` & `dplyr` are nice, just used it for other data. — nils, Dec 11 '15 at 17:45

reshape wide into long while splitting

3 Answers3