4

Here is my original output.

BizDate     Description                             TotalAmount TotalQty
2013-09-01 NIN JIOM COUGH SYRUP 75ML                11.8        2
2013-09-01 BREACOL COUGH SYRUP ADULTS 120ML         15.8        2
2013-09-02 AFRICAN SEA-COCONUT COUGH SYRUP 177ML    8.5         1
2013-09-03 AFRICAN SEA COCONUT COUGH SYRUP 177ML    8.2         1
2013-09-01 THREE LEGS COOLING WATER 200ML           21.0        14
2013-09-01 SEAHORSE BRAND COOLING WATER 200ML       4.4         4
2013-09-05 AFRICAN SEA-COCONUT COUGH SYRUP 177ML    8.5         1
2013-09-06 AFRICAN SEA COCONUT COUGH SYRUP 177ML    8.2         1

How would I transform it into something like this

 Description                              2013-09-01   2013-09-02 .......
 NIN JIOM COUGH SYRUP 75ML                some number some number .......
 BREACOL COUGH SYRUP ADULTS 120ML         some number some number .......
 AFRICAN SEA-COCONUT COUGH SYRUP 177ML    some number some number .......
 AFRICAN SEA COCONUT COUGH SYRUP 177ML    some number some number .......
 THREE LEGS COOLING WATER 200ML           some number some number .......
 SEAHORSE BRAND COOLING WATER 200ML       some number some number .......
 AFRICAN SEA-COCONUT COUGH SYRUP 177ML    some number some number .......
 AFRICAN SEA COCONUT COUGH SYRUP 177ML    some number some number .......

Here is some fake data.

set.seed(1)
dat <- data.frame(product = rep(letters[1:4], each = 4)
    , date = sample(seq(as.Date("2013-01-01"), as.Date("2013-01-10"), by = 1), 16, TRUE)
    , amount = sample(1:100, 16, TRUE)
    , qty = sample(1:4, 16, TRUE))

dat
   product       date amount qty
1        a 2013-01-03     72   2
2        a 2013-01-04    100   1
3        a 2013-01-06     39   4
4        a 2013-01-10     78   3
5        b 2013-01-03     94   4
6        b 2013-01-09     22   1
7        b 2013-01-10     66   3
8        b 2013-01-07     13   2
9        c 2013-01-07     27   4
10       c 2013-01-01     39   3
11       c 2013-01-03      2   4
12       c 2013-01-02     39   3
13       d 2013-01-07     87   3
14       d 2013-01-04     35   4
15       d 2013-01-08     49   1
16       d 2013-01-05     60   2

My dumb way of doing this. Is there any smarter to do this? I am thinking to use plyr. just direct me to the right package or way.

dat1<-subset(dat,date=="2013-01-01")
colnames(dat1)[c(3,4)]<-c("2013-01-01.amount","2013-01-01.qty")
dat1<-dat1[-2]
dat2<-subset(dat,date=="2013-01-02")
colnames(dat2)[c(3,4)]<-c("2013-01-02.amount","2013-01-02.qty")
dat2<-dat2[-2]
dat3<-subset(dat,date=="2013-01-03")
colnames(dat3)[c(3,4)]<-c("2013-01-03.amount","2013-01-03.qty")
dat3<-dat3[-2]

new.dat<-merge(dat1,dat2,by="product",all.x=T,all.y=T)
new.dat<-merge(new.dat,dat3,by="product",all.x=T,all.y=T)
new.dat[is.na(new.dat)]<-0

new.dat
  product 2013-01-01.amount 2013-01-01.qty 2013-01-02.amount 2013-01-02.qty
1       a                 0              0                 0              0
2       b                 0              0                 0              0
3       c                39              3                39              3
  2013-01-03.amount 2013-01-03.qty
1                72              2
2                94              4
3                 2              4
pnuts
  • 58,317
  • 11
  • 87
  • 139
chee.work.stuff
  • 326
  • 2
  • 14
  • It would create n rows and 61 columns. Thanks. – chee.work.stuff Nov 12 '13 at 08:42
  • 5
    What have you attempted? What packages did you look at? Did you do any research for the solution yourself? What functions do you think *might* be useful here? Please show a little willing. This is not a *me give data, you write code* site. – Simon O'Hanlon Nov 12 '13 at 08:46
  • I am using plyr. No luck yet. was trying alot of ways. I found a dumb way of doing it by subsetting day out and rename it manually. And join them back. I just need a smarter way to do it. Sorry. – chee.work.stuff Nov 12 '13 at 08:52
  • 1
    Please add what you tried to the question, along with why it didn't work and what you expected. Use a toy example if your real data is too big. People are much more willing to help when there is something to work off of. – Simon O'Hanlon Nov 12 '13 at 09:21
  • 2
    +1 for updating the question, good work! – Simon O'Hanlon Nov 12 '13 at 10:30

2 Answers2

4

Considering the current form of your data, just use reshape from base R:

> reshape(dat, direction = "wide", idvar="product", timevar="date")
   product amount.2013-01-03 qty.2013-01-03 amount.2013-01-04 qty.2013-01-04 amount.2013-01-06
1        a                72              2               100              1                39
5        b                94              4                NA             NA                NA
9        c                 2              4                NA             NA                NA
13       d                NA             NA                35              4                NA
   qty.2013-01-06 amount.2013-01-10 qty.2013-01-10 amount.2013-01-09 qty.2013-01-09
1               4                78              3                NA             NA
5              NA                66              3                22              1
9              NA                NA             NA                NA             NA
13             NA                NA             NA                NA             NA
   amount.2013-01-07 qty.2013-01-07 amount.2013-01-01 qty.2013-01-01 amount.2013-01-02
1                 NA             NA                NA             NA                NA
5                 13              2                NA             NA                NA
9                 27              4                39              3                39
13                87              3                NA             NA                NA
   qty.2013-01-02 amount.2013-01-08 qty.2013-01-08 amount.2013-01-05 qty.2013-01-05
1              NA                NA             NA                NA             NA
5              NA                NA             NA                NA             NA
9               3                NA             NA                NA             NA
13             NA                49              1                60              2

Alternatively, you may consider the "reshape2" package:

library(reshape2)
datL <- melt(dat, id.vars=c("product", "date"))
dcast(datL, product ~ date + variable, value.var="value")
A5C1D2H2I1M1N2O1R2T1
  • 190,393
  • 28
  • 405
  • 485
3

Try this:

#Dummy data
df <- read.table(text="
BizDate Description TotalAmount TotalQty
2013-09-01 NIN 11.8 2
2013-09-01 BREACOL 15.8 2
2013-09-02 AFRICAN 8.5 1
2013-09-03 AFRICAN 8.2 1
2013-09-01 THREE 21.0 14
2013-09-01 SEAHORSE 4.4 4
2013-09-05 AFRICAN 8.5 1
2013-09-06 AFRICAN 8.2 1",
                 header=TRUE)

library(reshape2)

dcast(df,Description~BizDate,value="TotalAmount")
#Description 2013-09-01 2013-09-02 2013-09-03 2013-09-05 2013-09-06
#1     AFRICAN         NA        8.5        8.2        8.5        8.2
#2     BREACOL       15.8         NA         NA         NA         NA
#3         NIN       11.8         NA         NA         NA         NA
#4    SEAHORSE        4.4         NA         NA         NA         NA
#5       THREE       21.0         NA         NA         NA         NA
dcast(df,Description~BizDate,value="TotalQty")
#Description 2013-09-01 2013-09-02 2013-09-03 2013-09-05 2013-09-06
#1     AFRICAN         NA          1          1          1          1
#2     BREACOL          2         NA         NA         NA         NA
#3         NIN          2         NA         NA         NA         NA
#4    SEAHORSE          4         NA         NA         NA         NA
#5       THREE         14         NA         NA         NA         NA
zx8754
  • 52,746
  • 12
  • 114
  • 209
  • @chee.work.stuff, I would not call "reshape" the right package for it, since it is not actively under development and has a package that follows it. zx8754, I would suggest updating to "reshape2". – A5C1D2H2I1M1N2O1R2T1 Nov 12 '13 at 09:56
  • @AnandaMahto updated. `acast vs dcast` http://stackoverflow.com/questions/12468843/could-not-find-function-cast-despite-reshape2-installed-and-loaded – zx8754 Nov 12 '13 at 10:20
  • 2
    +1 for updating. I know it's just a matter of one letter, but there are some pretty significant differences between the two, and for more "future proof" code and better efficiency, I thought I would make the recommendation to you :-) – A5C1D2H2I1M1N2O1R2T1 Nov 12 '13 at 10:28