1

I have a a number of periods which I would like to split by 5 minute intervals (00:00, 00:05, 00:10, 00:15, ...), if they span across such 5 minute intervals.

How do I turn

periods_raw <- read.table(header=TRUE, text="
COLOR           UTC_DateTime_Start      UTC_DateTime_End
Green           2022-06-15 14:56:23     2022-06-15 15:03:11
Purple          2022-06-15 15:03:11     2022-06-15 15:14:48
Red             2022-06-15 16:27:11     2022-06-15 16:29:48
")

into

periods_split <- read.table(header=TRUE, text="
COLOR           UTC_DateTime_Start      UTC_DateTime_End
Green           2022-06-15 14:56:23     2022-06-15 15:00:00
Green           2022-06-15 15:00:00     2022-06-15 15:03:11
Purple          2022-06-15 15:03:11     2022-06-15 15:05:00
Purple          2022-06-15 15:05:00     2022-06-15 15:10:00
Purple          2022-06-15 15:10:00     2022-06-15 15:14:48
Red             2022-06-15 16:27:11     2022-06-15 16:29:48
")
user438383
  • 5,716
  • 8
  • 28
  • 43
Benisburgers
  • 352
  • 4
  • 17

1 Answers1

3

If the 'UTC_DateTime' columns are not Datetime class (POSIXct), convert (ymd_hms from lubridate), then loop over the columns 'start', 'end' (suffix), create a sequence by '5 minute', use floor_date for the elements that are not the first or the last, create a tibble by removing the last and first observation from the sequence to create the new 'Start', 'End' columns and unnest the list column

library(dplyr)
library(tidyr)
library(lubridate)
library(purrr)
periods_raw %>% 
  mutate(across(starts_with("UTC_DateTime"), ymd_hms)) %>% 
  mutate(new = map2(UTC_DateTime_Start, UTC_DateTime_End, ~ {
     v1 <- c(seq(.x, .y, by = "5 min"), .y)
     v1[-c(1, length(v1))] <- floor_date(v1[-c(1, length(v1))], "5 min")
   tibble(UTC_DateTime_Start = v1[-length(v1)], UTC_DateTime_End = v1[-1]) 
    }), .keep = "unused") %>% 
  unnest(new)

-output

# A tibble: 6 × 3
  COLOR  UTC_DateTime_Start  UTC_DateTime_End   
  <chr>  <dttm>              <dttm>             
1 Green  2022-06-15 14:56:23 2022-06-15 15:00:00
2 Green  2022-06-15 15:00:00 2022-06-15 15:03:11
3 Purple 2022-06-15 15:03:11 2022-06-15 15:05:00
4 Purple 2022-06-15 15:05:00 2022-06-15 15:10:00
5 Purple 2022-06-15 15:10:00 2022-06-15 15:14:48
6 Red    2022-06-15 16:27:11 2022-06-15 16:29:48

data

periods_raw <- structure(list(COLOR = c("Green", "Purple", "Red"), UTC_DateTime_Start = c("2022-06-15 14:56:23", 
"2022-06-15 15:03:11", "2022-06-15 16:27:11"), UTC_DateTime_End = c("2022-06-15 15:03:11", 
"2022-06-15 15:14:48", "2022-06-15 16:29:48")), class = "data.frame", row.names = c(NA, 
-3L))
akrun
  • 874,273
  • 37
  • 540
  • 662
  • Weirdly I get this error: Error in seq.default(.x, .y, by = "5 min") : 'from' must be a finite number In addition: Warning message: In seq.default(.x, .y, by = "5 min") : NAs introduced by coercion – Benisburgers Jul 25 '22 at 14:58
  • 1
    Nevermind! Had to update my libraries. – Benisburgers Jul 25 '22 at 15:04
  • It seems that e.g. 12:35:00 - 12:40:30, does not get rounded: You still get 12:35:00 - 12:40:30. However, I would like to get to periods then: 1. 12:35:00 - 12:40:00 2. 12:40:00 - 12:40:30 How do I do that? – Benisburgers Jul 26 '22 at 07:12
  • @Benisburgers can you add the example in your post that didn't work – akrun Jul 26 '22 at 22:38