4

Someone gave me really bad data in Excel, where the date (such as July 1, 2015) is 20150701 and the time (such as 11:41:23) is 114123. There are over 50,000 rows of data and I need to convert these all into proper date and time objects. These aren't the number of seconds from any epoch, it is just the date or time without the dashes or the colons.

I imported them into a data frame and converted the dates using the ymd() function, but I can't find a function to do that for time, hms() gives me an error:

package(lubridate)
df <- readWorksheetFromFile(file="cktime2012.xls", sheet=1)
df$date <- ymd(df$date)
df$time <- hms(df$time)
# Warning message:
#  In .parse_hms(..., order = "HM", quiet = quiet) :
#   Some strings failed to parse

and I get a data frame that looks like this before running the last line. Once I run the last line, the TIMEIN column turns into all NA's:

DATEIN      TIMEIN  etc...
2012-02-01  200000  etc...
etc...

I need it to look like this for all 50,000 rows. I included POSIXct as a tag, because I don't know if there could be a way to use that to help convert:

DATEIN      TIMEIN      etc...
2012-02-01  20:00:00    etc...
etc...
Zachary Weixelbaum
  • 904
  • 1
  • 11
  • 24
  • If what you're actually trying to do is get the date and time together in a single column in `POSIXct` format, then you should paste together `DATEIN` and `TIMEIN` and convert to POSIXct (using any of the answers to get the formatting right). – eipi10 Jul 01 '15 at 17:23

3 Answers3

4

If TIMEIN is always six characters (i.e., there's a leading zero for times before 10 AM), then you can do this:

df$TIMEIN = paste0(substr(df$TIMEIN,1,2),":",substr(df$TIMEIN,3,4),":", substr(df$TIMEIN,5,6))
df$TIMEIN = hms(df$TIMEIN)
eipi10
  • 91,525
  • 24
  • 209
  • 285
  • when I do that, the first line turns it into the proper XX:XX:XX format, but when I do the second line, all the cells turn into zeros – Zachary Weixelbaum Jul 01 '15 at 17:15
  • 2
    It will be easier to figure out what might be going on if you post some sample data that reproduces the problem. – eipi10 Jul 01 '15 at 17:20
  • All I have is what you see in the question, but your answer works perfectly if you only run the first line of code. I then can take it and merge it with the DATEIN information and it will give me a proper date and time column – Zachary Weixelbaum Jul 01 '15 at 17:28
4

You can try this too to get the specified time, but then you'd have to get rid of the date too.

> as.POSIXct("200000", format="%H%M%S")
[1] "2015-07-01 20:00:00 IST"

Edit- Okay, as.POSIXct() works on date and time. So, to merge the whole into one you can do something like this.

> as.POSIXct("20120201 200000", format="%Y%m%d %H%M%S")
[1] "2012-02-01 20:00:00 IST"
TrigonaMinima
  • 1,828
  • 1
  • 23
  • 35
  • @Frank Is there a way to merge the time with the date from the other row, instead of showing today's date? That could save me from the next step that I was going to need to do – Zachary Weixelbaum Jul 01 '15 at 17:22
  • 1
    Something like this: `ymd_hms(paste(df$DATEIN, gsub("(..)", "\\1 ",df$TIMEIN))` using the answer in Frank's comment. – eipi10 Jul 01 '15 at 17:27
0

Or simpler than the ones above, using the pipes in tidyverse you can get the following:

# make sure you have dates stores as POSIXct 
# call in tidyverse library to make use of pipes and use the code bellow

df_hms <- df %>%
  mutate(time = hms::as.hms(TIMEIN))
GaB
  • 1,076
  • 2
  • 16
  • 29