I have a data frame, df, which has a factor variable for the date in the following format:

2015-12-15 10:00:00
2015-12-19 12:00:00
2015-12-20 20:00:00

It is hourly data. The problem arises at midnight, 00:00:00, because the hour is not shown. It just says:

21/12/2015

As you can see, it shows only the day and lacks the hour. So I use strptime to convert to a date format using:

df$date <- strptime(df$date,"%d/%m/%Y %H:%M")

It all works fine for all the hours and days except for any day at midnight, 00:00:00, which returns:

NA

I'd really appreciate some help, as I've been looking at previous posts on StackOverflow and other forums but haven't managed to figure out the solution for this specific problem yet.

Henrik
adrian1121
    Those can't be the format of your date variables in your data.frame. If so, the format string `"%d/%m/%Y %H:%M"` shouldn't work for any of them. That says "day/month/year hour:min", not "year-month-day hour:min:sec". Be sure to actually provide a reproducible example. – MrFlick May 07 '16 at 14:12
    I think the problem is just the string representation of POSIXct dates... when you print a POSIXct object, by default it hides the time portion when it's 00:00:00, but the information is still there... just try to print it with `format(dates,"%Y/%m/%d %H:%M")` – digEmAll May 07 '16 at 14:15

2 Answers

From R's strptime documentation (emphasis added):

format

A character string. The default for the format methods is "%Y-%m-%d %H:%M:%S" if any element has a time component which is not midnight, and "%Y-%m-%d" otherwise. If options("digits.secs") is set, up to the specified number of digits will be printed for seconds.

So the information is still there; you just need to format it so that it prints with the time components.

> midnight <- strptime("2015-12-19 00:00:00","%Y-%m-%d %H:%M")
> midnight
[1] "2015-12-19 EST"
> format(midnight,"%Y/%m/%d %H:%M")
[1] "2015/12/19 00:00"
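
As a small sanity check that the time is really stored and only the printing hides it, here is a minimal sketch. The vector `x` and the UTC time zone are assumptions made just to keep the output reproducible; they are not from the question.

```r
# Hypothetical example data: two midnight timestamps, forced to UTC
x <- as.POSIXct(c("2015-12-19 00:00:00", "2015-12-20 00:00:00"), tz = "UTC")

# Default printing drops the time because every element is midnight
print(x)

# An explicit format string always shows the time part
with_time <- format(x, "%Y-%m-%d %H:%M:%S")
with_time
# [1] "2015-12-19 00:00:00" "2015-12-20 00:00:00"

# The underlying values are full date-times, exactly 24 hours apart
gap_hours <- as.numeric(difftime(x[2], x[1], units = "hours"))
gap_hours
# [1] 24
```

So nothing needs to be "fixed" in the data itself in this case; only the display format changes.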
Bill the Lizard

If we have a vector like "v1" (defined under "data" below), strptime returns NA for the elements that don't match the format:

strptime(v1,  "%d/%m/%Y %H:%M:%S", tz = "UTC")
#[1] "2015-12-19 12:00:00 UTC" NA  

One way to correct this is to paste the "00:00:00" string onto the elements that lack a time component:

v1[!grepl(":", v1)] <- paste(v1[!grepl(":", v1)], "00:00:00") 
strptime(v1,  "%d/%m/%Y %H:%M:%S", tz = "UTC")
#[1] "2015-12-19 12:00:00 UTC" "2015-12-19 00:00:00 UTC"
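
Since the question mentions a factor column in a data frame, here is a hedged sketch of the same paste approach applied there. The data frame `df` below is made up to mimic the question, and the helper names `d` and `no_time` are only for illustration. The factor is converted to character first, because assigning a string that is not an existing level into a factor gives NA.

```r
# Hypothetical data frame: a factor column where the midnight row has no time
df <- data.frame(date = factor(c("19/12/2015 12:00:00", "19/12/2015")))

# Work on a character copy rather than on the factor itself
d <- as.character(df$date)

# Append midnight to the entries that have no time component
no_time <- !grepl(":", d)
d[no_time] <- paste(d[no_time], "00:00:00")

# as.POSIXct is used instead of strptime because POSIXct stores more
# cleanly as a data frame column than the POSIXlt that strptime returns
df$date <- as.POSIXct(d, format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

format(df$date, "%Y-%m-%d %H:%M:%S")
# [1] "2015-12-19 12:00:00" "2015-12-19 00:00:00"
```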

Or, if we use lubridate, parse_date_time can take multiple formats:

library(lubridate)
parse_date_time(v1, guess_formats(v1, c("%d/%m/%Y %H:%M:%S", "%d/%m/%Y")))
#[1] "2015-12-19 12:00:00 UTC" "2015-12-19 00:00:00 UTC"
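
If adding a package is not an option, the multiple-format idea can be approximated in base R by parsing with the first format and retrying only the failures with the second. This is a rough sketch and not what lubridate does internally; the names `formats`, `parsed` and `missing` are just for illustration.

```r
v1 <- c("19/12/2015 12:00:00", "19/12/2015")

# Candidate formats, tried in order
formats <- c("%d/%m/%Y %H:%M:%S", "%d/%m/%Y")

# First pass: elements without a time component come back as NA
parsed <- as.POSIXct(strptime(v1, formats[1], tz = "UTC"))

# Second pass: re-parse only the failures with the date-only format,
# which places them at midnight
missing <- is.na(parsed)
parsed[missing] <- as.POSIXct(strptime(v1[missing], formats[2], tz = "UTC"))

format(parsed, "%Y-%m-%d %H:%M:%S")
# [1] "2015-12-19 12:00:00" "2015-12-19 00:00:00"
```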

data

v1 <- c("19/12/2015 12:00:00", "19/12/2015") 
akrun