0

I have a df like below and I am trying to convert the datettime column to a datetime format.

ID <- c("A","A","B","B")
datettime <- c("2015-12-03T13:04:06-06:00","2015-12-03T13:54:06-06:00","2015-12-03T16:04:06-06:00","2015-12-03T19:54:06-06:00")
df <- data.frame(ID,datettime)

the datettime column in my dataset is a character. I am trying to convert it to a datetime format but unable to get it right.

library(lubridate)
df$datettime <- ymd_hms(df$datettime)
#The problem here is the hours,minutes and seconds get messed up. 

df$datettime <- as.POSIXct(df$datettime,format="%Y%m%d %H%M%S")
#This just ouputs NA's in the column

My desired output is

  ID           datettime
1  A 2015-12-03 13:04:06
2  A 2015-12-03 13:54:06
3  B 2015-12-03 16:04:06
4  B 2015-12-03 19:54:06

Any help would be appreciated.

Sharath
  • 2,225
  • 3
  • 24
  • 37
  • 2
    You almost got it: `df$datettime <- as.POSIXct(df$datettime,format="%Y-%m-%dT%H:%M:%S")`. BTW: what is this funny `-06:00` at the end of each string? Have never seen this format. But luckily the format string that I have given ignores this part anyhow. – cryo111 Dec 03 '15 at 19:37
  • That funny thing is because of the AM & PM that comes while I am parsing this column from JSON. Anyway thanks. – Sharath Dec 03 '15 at 19:43

2 Answers2

2

also try this:

df$datettime <- paste(substr(df$datettime,0,10),substr(df$datettime,12,19))
library(lubridate)
df$datettime <- ymd_hms(df$datettime)
road_to_quantdom
  • 1,341
  • 1
  • 13
  • 20
  • It is not necessary to extract the string in two parts. `df$datettime <- substr(as.character(df$datettime), 1, 19)` is enough and lubridate seems to be able to cope with the `T` between date and time. – Stibu Dec 03 '15 at 19:45
  • yeah i just noticed that after seeing other answers posted. I didn't think they would deal with the "T" and "-6:00" very well but I guess they do – road_to_quantdom Dec 03 '15 at 19:46
  • I like this solution too with lubridate. Thanks road_to_quantdom and @Stibu – Sharath Dec 03 '15 at 19:49
  • Actually, one does need to remove the `-06:00`. If it is left intact, lubridate adds 6 hours to the times. – Stibu Dec 03 '15 at 19:49
1

you forgot to add a dash and colon between the separators in your date. This will give you your desired output.

# original code from above:

ID <- c("A","A","B","B")
datettime <- c("2015-12-03T13:04:06-06:00","2015-12-03T13:54:06-06:00","2015-12-03T16:04:06-06:00","2015-12-03T19:54:06-06:00")
df <- data.frame(ID,datettime)

# convert to a POSIXct object
df$datettime <- as.POSIXct(df$datettime,format="%Y-%m-%dT%H:%M:%S")

df
  ID           datettime
1  A 2015-12-03 13:04:06
2  A 2015-12-03 13:54:06
3  B 2015-12-03 16:04:06
4  B 2015-12-03 19:54:06
s_scolary
  • 1,361
  • 10
  • 21
  • 3
    This answer does not work. It should be `df$datettime <- as.POSIXct(df$datettime,format="%Y-%m-%dT%H:%M:%S")`. And your output does not agree with the output that the OP requested. – Stibu Dec 03 '15 at 19:46
  • Actually. yes. he missed the T in that. – Sharath Dec 03 '15 at 19:48
  • 1
    @Sharath this answer is incorrect... the times do not match up with your desired output – road_to_quantdom Dec 03 '15 at 19:51
  • 1
    Your last approach with `as.POSIXct` and without the `T` separator gives `NA`s on my machine. – cryo111 Dec 03 '15 at 20:51
  • 2
    Have you actually run the first three lines of your code (i.e., the definition of the OP's dataframe) again before executing your last approach `df$datettime <- as.POSIXct(df$datettime,format="%Y-%m-%d %H:%M:%S")`? Otherwise, it wouldn't make much sense to run this line of code on data that is already properly formatted by your `lubridate` solution. :) – cryo111 Dec 03 '15 at 21:43
  • Thanks for the comment @cryo111. I didn't realize that lubridate was adding 6 hours to the times. I ended up removing the lubridate section altogether as POSIXct was able to deal with the timeshift problem. – s_scolary Dec 04 '15 at 13:55