Wide to long format in R

Question

Below is a snippet of my dataset:

> head(df)
  Product  Region  Sector      Type       Date Value
Product A Capital Primary Continued 2012-01-01   395
Product C Capital Primary Continued 2012-01-01    37
Product D Capital Primary Continued 2012-01-01   208
Product A Central Primary Continued 2012-01-01   343
Product C Central Primary Continued 2012-01-01     1
Product D Central Primary Continued 2012-01-01    80

> tail(df)
  Product   Region  Sector Type       Date Value
Product C Southern Unknown  New 2014-12-01    11
Product D Southern Unknown  New 2014-12-01    18
Product A  Zealand Unknown  New 2014-12-01    19
Product B  Zealand Unknown  New 2014-12-01    10
Product C  Zealand Unknown  New 2014-12-01     9
Product D  Zealand Unknown  New 2014-12-01     6

I have 12 dates ranging from 2012-01-01 to 2014-12-01 and several levels for each factor variable. I would like to extrapolate on this dataset, i.e. add some extra random observations after 2014-12-01. My initial thought was to use dcast, e.g.:

dcast(df, Date ~ Product + Region + Type + Sector)

I wanted this to give me one column per combination of factor levels. It would result in a data frame with 12 rows (the dates) and 118 columns (the combinations of all factors). I could then add some rows to this data frame and convert it back to long format using melt. But this doesn't seem to be possible. Are there any other ways to do this?

@Heroka That actually seems like a good solution - don't know why I haven't thought of that. Thanks! — marcopah, Feb 24 '16 at 20:07

Reid Minto · Accepted Answer · 2016-02-26T19:37:49.853

You can just use rbind to append the new rows to the long-format data directly; just make sure the column names are the same in both data frames:

# Small stand-in for the existing long-format data
df <- data.frame(Product = c("Product A", "Product B", "Product C"), Region = c("Capital", "Capital", "Capital"),
                 Sector = c("Primary", "Primary", "Primary"), Type = c("Continued", "Continued", "Continued"),
                 Date = c("2012-01-01", "2013-01-01", "2014-12-01"), Value = c(397, 3, 456))


# New observations to append, using the same column names as df
newdata <- data.frame(Product = c("Product A", "Product B", "Product C"), Region = c("Capital", "Capital", "Capital"),
                      Sector = c("Primary", "Primary", "Primary"), Type = c("Continued", "Continued", "Continued"),
                      Date = c("2014-12-01", "2014-12-02", "2014-12-03"), Value = c(1, 2, 3))


all(colnames(df) == colnames(newdata))
[1] TRUE
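
The element-wise comparison above also requires the columns to be in the same order, although rbind on data frames matches columns by name. A slightly stricter check can be sketched as follows. This is only an illustration: can_rbind is a made-up helper name, and the class comparison is deliberately strict (it would also flag integer versus numeric columns, which rbind can combine).

```r
# Hypothetical helper (not from the answer): TRUE if two data frames have the
# same column names, in any order, and the same first class for each column.
can_rbind <- function(a, b) {
  if (!setequal(names(a), names(b))) return(FALSE)
  classes_a <- sapply(a, function(col) class(col)[1])
  classes_b <- sapply(b[names(a)], function(col) class(col)[1])
  all(classes_a == classes_b)
}

x <- data.frame(Date = "2014-12-01", Value = 1, stringsAsFactors = FALSE)
y <- data.frame(Value = 2, Date = "2014-12-02", stringsAsFactors = FALSE)
z <- data.frame(Date = as.Date("2014-12-03"), Value = 3)

can_rbind(x, y)  # same names and classes, columns in a different order
can_rbind(x, z)  # Date is character in x but of class Date in z
```

Checking classes matters here mainly for the Date column, which is stored as text in this example; converting it with as.Date in both data frames before combining keeps date arithmetic available afterwards.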


combined <- rbind(df, newdata)

combined

    Product  Region  Sector      Type       Date Value
1 Product A Capital Primary Continued 2012-01-01   397
2 Product B Capital Primary Continued 2013-01-01     3
3 Product C Capital Primary Continued 2014-12-01   456
4 Product A Capital Primary Continued 2014-12-01     1
5 Product B Capital Primary Continued 2014-12-02     2
6 Product C Capital Primary Continued 2014-12-03     3
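
Since the question asks for random observations after 2014-12-01, the same rbind approach can be extended to every factor combination that already occurs in the data. The following is a minimal sketch rather than a definitive method: it assumes monthly dates stored as Date objects, draws values from a Poisson distribution centred on each combination's observed mean (an arbitrary choice for illustration), and uses a made-up helper name, extend_with_random.

```r
# Toy data in the same layout as the question's data frame.
df <- data.frame(
  Product = c("Product A", "Product A", "Product C", "Product C"),
  Region  = c("Capital", "Capital", "Central", "Central"),
  Sector  = "Primary",
  Type    = "Continued",
  Date    = as.Date(c("2014-11-01", "2014-12-01", "2014-11-01", "2014-12-01")),
  Value   = c(395, 343, 37, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: append n_new monthly rows per observed factor combination.
extend_with_random <- function(df, n_new = 3) {
  keys <- c("Product", "Region", "Sector", "Type")
  # One row per observed combination, with its mean Value.
  means <- aggregate(df["Value"], by = df[keys], FUN = mean)
  # Monthly dates following the last observed date.
  new_dates <- seq(max(df$Date), by = "month", length.out = n_new + 1)[-1]
  # Repeat each combination once per new date.
  newdata <- means[rep(seq_len(nrow(means)), each = n_new), keys]
  newdata$Date <- rep(new_dates, times = nrow(means))
  # Assumption: Poisson noise around the combination's mean.
  newdata$Value <- rpois(nrow(newdata), lambda = rep(means$Value, each = n_new))
  rownames(newdata) <- NULL
  rbind(df, newdata)
}

set.seed(1)  # make the random draws reproducible
combined <- extend_with_random(df, n_new = 3)
tail(combined)
```

Because the data stays in long format throughout, no dcast/melt round trip is needed, and combinations that never occur in the data are not created.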
