
I have a dataset in which one column holds monthly dates, from 02/01/2004 to 09/01/2008. I need to create a dummy variable for the dates in 2008. I tried:

dummy <- as.numeric(Date >= 01/01/2008),

but R reported:

">= is not meaningful for factors"

So I tried converting the factor variable Date to numeric, but all my dates disappeared and were replaced by seemingly random numbers.

RLave
agripaper

2 Answers


This creates some data:

dat <- data.frame(
  date = c("01/01/2017", "02/01/2017", "01/01/2018")
)

Now first we get the correct date format, then we create the dummy:

dat$date <- strptime(as.character(dat$date), "%d/%m/%Y") # parse the day/month/year strings
dat$date <- format(dat$date, "%Y-%m-%d") # reformat as ISO "YYYY-MM-DD" strings (character, not Date), which compare chronologically

# create dummy:
dat$dummy <- 0 
dat$dummy[which(dat$date >= "2018-01-01")] <- 1

Output:

        date dummy
1 2017-01-01     0
2 2017-01-02     0
3 2018-01-01     1
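
As an alternative sketch, the strings can be kept as true `Date` objects with base R's `as.Date()` instead of being formatted back to character, so that the comparison is done on dates rather than on text. This assumes the same day/month/year layout as the example data above.

```r
# Same example data as above (day/month/year strings)
dat <- data.frame(
  date = c("01/01/2017", "02/01/2017", "01/01/2018"),
  stringsAsFactors = FALSE
)

# Parse into the Date class; as.character() also guards against factor input
dat$date <- as.Date(as.character(dat$date), format = "%d/%m/%Y")

# Compare Date with Date; the logical result is converted to 0/1
dat$dummy <- as.integer(dat$date >= as.Date("2018-01-01"))

dat
```

The column stays a `Date`, so later steps such as sorting or extracting the year work without re-parsing.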
RLave
  • I tried what you suggested, but when I run `dat$date <- strptime(as.character(dat$date), "%d/%m/%Y")` and `dat$date <- format(dat$date, "%Y-%m-%d")`, my dataset loses its dates and becomes a table of NA, so the dummy is all 0 because the table has no dates anymore. – agripaper Sep 04 '18 at 09:47
  • You should add to your post an example of your data. Like copying the output from `dput(my_data)`. – RLave Sep 04 '18 at 09:51
  • This is an example of my data: _structure(list(Date= c("01/01/2005", "01/01/2006", "01/01/2007", "01/01/2008", "02/01/2004", "02/01/2005", "02/01/2006", "02/01/2007"), class = "factor"), LB = c(86.71, 82.86, 73.39, 75.65, 75.25, 70.1, 73.24, 80.18), CAC40 = c(3730.36, 3625.22, 3677.77, 3671.49, 3732.5, 3654.4, 3638, 3682), DAX = c(4018.16, 3856.7, 3978.26, 3921.49, 4065.4, 3891.2, 3838.5, 3924.5), DOW = c(10588.22, 10354.96, 10233.8, 10203.79, 10437, 10138.7, 10125.8, 10131), EURUSD = c(1.25, 1.23, 1.2, 1.22, 1.22, 1.2, 1.2, 1.23), BRENT = c(32.22, 32.77, 34.48, 36.61, 34.48, 40.02, 40.63, 46.08))_ – agripaper Sep 04 '18 at 10:11
  • I think the problem is that your dates are stored as a factor. Are you importing this data from a CSV or something similar? If so, use `stringsAsFactors = FALSE` when importing (example: `read.csv("file.csv", stringsAsFactors = FALSE)`). – RLave Sep 04 '18 at 10:16
  • Thanks, that solved the problem with the dates. I did everything you suggested, and everything is fine up to creating the all-0 dummy. But when I run _dat$dummy[which(dat$date >= "2018-01-01")] <- 1_, nothing happens, so I don't get my dummies for 2008. – agripaper Sep 04 '18 at 10:27
  • That's because the year in the `which` is 2018; change it to `which(dat$date >= "2008-01-01")`. – RLave Sep 04 '18 at 10:31
  • Thanks a lot, now I have what I was looking for! Last question: how can I add this dummy column to my original dataset? – agripaper Sep 04 '18 at 10:34
  • In R I now have two data frames: my initial data (Date, LB, CAC40, ...) and another with two columns, the first for the dates and the second for the dummies I created. Is it possible to add the dummy column to the first data frame, so that I have only one data frame to work with? – agripaper Sep 04 '18 at 12:56
  • You should update your data frame directly; just adapt my code above to your case. – RLave Sep 04 '18 at 13:14
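
Putting the comment thread together, here is a minimal sketch on a data frame shaped like the asker's. The `Date` column is a factor, so `as.numeric()` on it returns the internal level codes (the "random numbers" from the question), and the dummy can be written straight into the original data frame as a new column. The name `my_data`, the choice of three rows borrowed from the sample in the comments, and the day/month/year reading of the dates are assumptions for illustration.

```r
# Hypothetical stand-in for the asker's data, with Date stored as a factor
my_data <- data.frame(
  Date = factor(c("02/01/2004", "01/01/2008", "01/01/2007")),
  LB   = c(75.25, 75.65, 73.39)
)

# as.numeric() on a factor returns level codes, not dates
as.numeric(my_data$Date)

# Convert through character first, then parse (assumed day/month/year)
my_data$Date <- as.Date(as.character(my_data$Date), format = "%d/%m/%Y")

# Add the 2008 dummy as a new column of the same data frame
my_data$dummy <- as.integer(my_data$Date >= as.Date("2008-01-01"))

my_data
```

Because the dummy is assigned with `my_data$dummy <- ...`, no second data frame or merge step is needed.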

A one-line version, building on @RLave's answer:

dat$dummy <- as.numeric(dat$date >= "2018-01-01")
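
For the question's actual goal (flagging every date in 2008), one variant is to test the year itself, which keeps working if the data later extends past 2008. This is a sketch that assumes `dat$date` is already of class `Date`; the sample values are made up.

```r
# Made-up monthly dates, already stored as Date
dat <- data.frame(date = as.Date(c("2007-12-01", "2008-01-01", "2008-09-01")))

# format(..., "%Y") extracts the four-digit year as a string
dat$dummy <- as.integer(format(dat$date, "%Y") == "2008")

dat
```

If the column holds ISO character strings instead, `substr(dat$date, 1, 4) == "2008"` would be the analogous test.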
bttomio