I have a data set in R that has values in hours, minutes, and seconds format. However, some values only have hours and minutes, some only have minutes and seconds, some only have minutes, and some only have seconds. It's also not formatted very favorably. Sample data can be found below:

example <- as.data.frame(c("22h28m", "17m7s", "15m", "14s"))

I'd like to convert it to a POSIXct format like the one below, so that I can add and subtract time:

Column Title
22:28:00
00:17:07
00:15:00
00:00:14

I've tried as.POSIXct() and strptime(), but to no avail. Any help would be greatly appreciated - thanks!

maudib528

1 Answer

Maybe parse_date_time from lubridate?

library("lubridate")
x <- c("22h28m", "17m7s", "15m", "14s")
y <- parse_date_time(x, orders = c("%Hh%Mm%Ss", "%Hh%Mm", "%Hh%Ss", "%Mm%Ss", "%Hh", "%Mm", "%Ss"), exact = TRUE)
y
## [1] "0000-01-01 22:28:00 UTC"
## [2] "0000-01-01 00:17:07 UTC"
## [3] "0000-01-01 00:15:00 UTC"
## [4] "0000-01-01 00:00:14 UTC"

To get numbers of seconds since midnight, you could do:

y0 <- floor_date(y, unit = "day")
dy <- y - y0
dy
## Time differences in secs
## [1] 80880  1027   900    14

Then you could add dy to any reference time specified as a length-1 POSIXct object. For example, the most recent midnight in the current time zone:

y0 <- floor_date(now(), unit = "day")
y0 + dy
## [1] "2022-02-03 22:28:00 EST"
## [2] "2022-02-03 00:17:07 EST"
## [3] "2022-02-03 00:15:00 EST"
## [4] "2022-02-03 00:00:14 EST"
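
If you would rather not depend on lubridate for this step, the same offsets can be sketched in base R with a regular expression per unit. This is only a rough sketch: to_seconds is a hypothetical helper name, not something from lubridate or base R, and it assumes every string uses only the h, m, and s suffixes seen in the sample data.

```r
# Hypothetical helper: convert strings such as "22h28m" or "14s" to seconds.
to_seconds <- function(x) {
  # Extract the number in front of a unit letter; a missing unit counts as 0.
  grab <- function(unit) {
    m <- regmatches(x, regexec(paste0("([0-9]+)", unit), x))
    vapply(m, function(g) if (length(g) == 2L) as.numeric(g[2L]) else 0,
           numeric(1))
  }
  3600 * grab("h") + 60 * grab("m") + grab("s")
}

x <- c("22h28m", "17m7s", "15m", "14s")
secs <- to_seconds(x)
secs
## [1] 80880  1027   900    14
```

The result is a plain numeric vector of seconds, so it can be added to a POSIXct reference time in the same way as dy.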

Update

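After reading the documentation a bit more carefully, I realized that lubridate can produce dy directly: duration() parses strings like these. The toupper() call matters because lubridate's string parser reads a lowercase "m" as months and an uppercase "M" as minutes.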

dy <- duration(toupper(x))
dy
## [1] "80880s (~22.47 hours)"  "1027s (~17.12 minutes)" "900s (~15 minutes)"     "14s" 

Then you can do y0 + dy as above to obtain a POSIXct object, and, if you like,

strftime(y0 + dy, "%T")
## [1] "22:28:00" "00:17:07" "00:15:00" "00:00:14"

to obtain a character vector listing the times without dates or time zones.
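
Since the stated goal is to add and subtract time, here is a small base R sketch of that arithmetic. The second counts are copied from the output above, and the reference date and the UTC time zone are arbitrary choices made for the illustration.

```r
# Offsets in seconds, copied from the output earlier in this answer.
secs <- c(80880, 1027, 900, 14)

# A fixed reference midnight; the date and time zone are illustrative only.
t0 <- as.POSIXct("2022-02-03 00:00:00", tz = "UTC")

# POSIXct plus a number of seconds is again POSIXct.
t1 <- t0 + secs
format(t1, "%H:%M:%S")
## [1] "22:28:00" "00:17:07" "00:15:00" "00:00:14"

# Shift every time back by 90 seconds.
format(t1 - 90, "%H:%M:%S")
## [1] "22:26:30" "00:15:37" "00:13:30" "23:58:44"
```

Note that the last value wraps around to the previous day, which is one reason to keep the full POSIXct object for arithmetic and format it as a time only for display.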

Mikael Jagan