
I have some dates in a data frame, and when I use as.Date() to convert them, the years become 2020, which can't be right because the file only has data up to 2018.

What I have so far:

> fechadeinsc1[2]
[1] "2020-08-15"

> class(fechadeinsc1)
[1] "Date"

> fechainsc[2]
[1] "2017/99/99"

> class(fechainsc)
[1] "character"

As you can see, fechadeinsc1 was converted into a Date, while fechainsc is the original column, whose elements are character strings. Shouldn't fechadeinsc1 keep the same year, even though the day and month aren't valid?

Another example:

> fechadenac1[2]
[1] "2020-12-31"

> class(fechadenac1)
[1] "Date"

> fechanac[2]
[1] "12/31/2016"

> class(fechanac)
[1] "character"

Again, the year changes.

My code:

fechanac <- dat$fecha_nac
fechainsc <- dat$fecha_insc

fechadeinsc1 <- as.Date(fechainsc, tryFormats = c("%d/%m/%y", "%m/%d/%y", "%y", "%d%m%y", "%m%d%y"))
fechadenac1 <- as.Date(fechanac, tryFormats = c("%d/%m/%y", "%m/%d/%y", "%y", "%d%m%y", "%m%d%y"))

"dat" is the original data frame, which contains information about newborns registered in Ecuador in 2016 and 2017. If anyone wants the original .csv file, please contact me.

Phil
  • Try with `anydate` from `anytime` `anydate(c("2020-08-15", "12/31/2016")) [1] "2020-08-15" "2016-12-31"` – akrun Aug 15 '20 at 18:33

1 Answer


According to the documentation for strptime, which as.Date() refers to for its format strings, you should use an uppercase %Y for 4-digit years:

%y Year without century (00--99). On input, values 00 to 68 are prefixed by 20 and 69 to 99 by 19 -- that is the behaviour specified by the 2004 and 2008 POSIX standards, but they do also say ‘it is expected that in a future version the default century inferred from a 2-digit year will change’.

%Y Year with century. [...]
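
To make the difference concrete, here is a minimal sketch using the two strings from the question. Because %y consumes only two digits, the "20" of "2016" is read as the year and the trailing "16" is ignored, which gives 2020. The "2020-08-15" result is probably the bare "%y" format at work: strptime fills an unspecified month and day with the current ones, and the question was posted on that date. The helper parse_dates and its default format list are hypothetical, not part of base R, and the formats are only a guess at what the file contains.

```r
# %y consumes only two digits, so the "20" of "2016" becomes the year 2020;
# the trailing "16" is ignored.
wrong <- as.Date("12/31/2016", format = "%m/%d/%y")
print(wrong)    # "2020-12-31"

# %Y reads the full four-digit year.
right <- as.Date("12/31/2016", format = "%m/%d/%Y")
print(right)    # "2016-12-31"

# With an explicit, complete format, an impossible date becomes NA
# instead of silently turning into a wrong date.
invalid <- as.Date("2017/99/99", format = "%Y/%m/%d")
print(invalid)  # NA

# Hypothetical helper: try each format element by element and keep the
# first one that parses. The default formats are an assumption about the data.
parse_dates <- function(x, formats = c("%m/%d/%Y", "%Y/%m/%d")) {
  out <- rep(as.Date(NA), length(x))
  for (fmt in formats) {
    todo <- is.na(out)                          # elements not parsed yet
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out
}

parsed <- parse_dates(c("12/31/2016", "2017/08/15", "2017/99/99"))
print(parsed)   # "2016-12-31" "2017-08-15" NA
```

Unlike tryFormats, which picks a single format from the first non-NA element and applies it to the whole vector, this sketch tries the formats per element, so a column with mixed layouts is handled and unparseable entries such as "2017/99/99" stay NA.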

tevemadar