
I have a dataset with 10 columns. One of those columns is the date. I want to create dummy variables for every month. How do I go about doing this?

      Date     Col1     Col2  
2017-01-09        v        2
2017-05-01        s        7
2018-03-02        k        9

I can extract the month using lubridate:

library(lubridate)
df$MONTH <- month(df$Date)

      Date     Col1     Col2     MONTH
2017-01-09        v        2         1
2017-05-01        s        7         5
2018-03-02        k        9         3

How do I transform this so that a dummy variable for each month is cbind-ed to the original data?

      Date     Col1     Col2   M1   M2   M3   M4   M5   M6   M7   M8   M9   M10   M11   M12
2017-01-09        v        2    1    0    0    0    0    0    0    0    0     0     0     0
2017-05-01        s        7    0    0    0    0    1    0    0    0    0     0     0     0
2018-03-02        k        9    0    0    1    0    0    0    0    0    0     0     0     0
nak5120

1 Answer


One option is to apply tabulate to each value of 'MONTH' and assign the result as new columns:

df[paste0("M", 1:12)] <- as.data.frame(t(sapply(df$MONTH, tabulate, 12)))
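To see why this works, it may help to look at what tabulate returns for a single month number. The following is a minimal base R sketch using the three MONTH values from the question; the names `months` and `dummies` are made up for illustration.

```r
# MONTH values from the question's example rows
months <- c(1, 5, 3)

# tabulate(x, nbins = 12) counts how often each integer 1..12 occurs,
# so a single month number yields a length-12 indicator vector
tabulate(5, nbins = 12)

# sapply() returns one column per input value (a 12 x 3 matrix),
# so t() is needed to get one row per observation
dummies <- t(sapply(months, tabulate, nbins = 12))
colnames(dummies) <- paste0("M", 1:12)
dummies
```

Each row of `dummies` should contain a single 1, in the column matching that row's month.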

Or use row/column indexing: start from a matrix of 0s and, for each row, set to 1 the cell whose column index is given by 'MONTH'

m1 <- matrix(0, nrow(df), 12)
m1[cbind(seq_len(nrow(df)), df$MONTH)] <- 1
df[paste0("M", 1:12)] <- m1
df
#        Date Col1 Col2 MONTH M1 M2 M3 M4 M5 M6 M7 M8 M9 M10 M11 M12
#1 2017-01-09    v    2     1  1  0  0  0  0  0  0  0  0   0   0   0
#2 2017-05-01    s    7     5  0  0  0  0  1  0  0  0  0   0   0   0
#3 2018-03-02    k    9     3  0  0  1  0  0  0  0  0  0   0   0   0
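For anyone who wants to run the matrix-indexing approach end to end, here is a self-contained sketch. It rebuilds the question's three rows by hand and, as an assumption made only to avoid the lubridate dependency, extracts the month with base R's format(); lubridate's month() should give the same integers. Dropping MONTH at the end is optional and is done only to match the layout requested in the question.

```r
# Example data copied from the question
df <- data.frame(
  Date = as.Date(c("2017-01-09", "2017-05-01", "2018-03-02")),
  Col1 = c("v", "s", "k"),
  Col2 = c(2, 7, 9)
)

# Month number via base R (assumption: used here instead of lubridate::month)
df$MONTH <- as.integer(format(df$Date, "%m"))

# One row per observation, one column per month, all zeros to start
m1 <- matrix(0, nrow(df), 12)

# A two-column index matrix (row, column) selects exactly one cell per row
m1[cbind(seq_len(nrow(df)), df$MONTH)] <- 1

df[paste0("M", 1:12)] <- m1

# Drop the helper column to match the layout asked for in the question
df$MONTH <- NULL
df
```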
akrun