0

I have a dataframe that looks like this

      Vehicle Model Month Sales
      A   XXY     1    10
      A   XXY     1   100
      A   XXY     2    40
      A   XXY     3    10
      A   YYX     3    10
      A   YYX     3    33
      B   ZZZ     1    50
      B   ZZY     2    60

I want to be able to transform it as follows:

      Vehicle Model 1 2 3 4 5
       A      XXY  2  1  1  0  0
       A      YYX  0  0  2  0  0
       B      ZZZ  1  0  0  0  0
       B      ZZY  0  1  0  0  0

So essentially I want to - Group by 2 fields( Vehicle, Model) and then Count the number of record in the column "Sales" BY Month, and transpose the data so that Month becomes column and Vehicle/Model my rows. Also some models might not have up to 12 months, but I want to display all the columns 1 to 12, leaving 0 if no data available

The dataframe is quite large. Any recommendation? Thanks

Elly
  • 129
  • 2
  • 12
  • The data header should be aligned with columns. At first I thought the last column was "Month Sales". – dracodoc Sep 20 '16 at 13:40

1 Answers1

0

Not sure about the "one step" part of the question. Doing it in several steps doesn't work?

library(data.table)
s <- "      Vehicle Model Month Sales
      A   XXY     1    10
      A   XXY     1   100
      A   XXY     2    40
      A   XXY     3    10
      A   YYX     3    10
      A   YYX     3    33
      B   ZZZ     1    50
      B   ZZY     2    60"
dt <- fread(s)

dt[, sale_count := .N, by = .(Vehicle, Model, Month)]
dt[, Sales := NULL]
dt <- unique(dt)
dcast(dt, Vehicle + Model ~ Month, value.var = "sale_count")

result: there is NA instead of 0 for months not available. You can change it to 0 if you want.

   Vehicle Model  1  2  3
1:       A   XXY  2  1  1
2:       A   YYX NA NA  2
3:       B   ZZY NA  1 NA
4:       B   ZZZ  1 NA NA
dracodoc
  • 2,603
  • 1
  • 23
  • 33