How to select data by the hour

Question

I've been trying to split my data into 30-minute intervals, and I have not been able to find a solution. The date and time are stored in a single date-time variable (Date_Time). I just want to be able to make a df based on the time of day; the date is not important.

I have tried splitting the data by formatting the date to just the time, but that did not work either.

This is what the df looks like:

    Date_Time              S     C     P
    2016-08-02 21:14:52   20     1     1
    2016-08-02 21:26:37   35     1     2
    2016-09-07 21:31:33   28     1     8
    2016-08-25 21:46:16   23     3     4 
    2016-08-24 21:54:23   40     1     6

If I restricted the df to times between 21:00:00 and 21:30:00, it would look like:

    Date_Time              S     C     P
    2016-08-02 21:14:52   20     1     1
    2016-08-02 21:26:37   35     1     2

I'm new to R and coding, so any help would be appreciated!

Related - https://stackoverflow.com/questions/42378533/determine-if-24-hour-datetime-is-within-interval/42378962 or maybe even https://stackoverflow.com/questions/41465131/r-how-to-filter-a-timestamp-by-hour-and-minute — thelatemail, May 24 '19 at 02:29
Stealing from the answer over there, I'd do something like - `fd <- format(dat$Date_Time, "%H:%M"); dat[fd >= "21:00" & fd <= "21:30",]` — thelatemail, May 24 '19 at 02:43
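
The one-liner in the comment above can be spelled out as follows. This is only a minimal sketch, assuming `Date_Time` is of class POSIXct (parsed as UTC here purely to keep the example reproducible); `dat`, `tod` and `in_window` are illustrative names. It compares on `%H:%M:%S` with a half-open window, so a row stamped 21:30:10 would be excluded, whereas comparing on `%H:%M` with `<=` would include it.

```r
# Sample data from the question; UTC is an assumption made for reproducibility.
dat <- data.frame(
  Date_Time = as.POSIXct(c("2016-08-02 21:14:52", "2016-08-02 21:26:37",
                           "2016-09-07 21:31:33", "2016-08-25 21:46:16",
                           "2016-08-24 21:54:23"), tz = "UTC"),
  S = c(20L, 35L, 28L, 23L, 40L),
  C = c(1L, 1L, 1L, 3L, 1L),
  P = c(1L, 2L, 8L, 4L, 6L)
)

# Zero-padded "HH:MM:SS" strings sort in the same order as the times they
# represent, so plain string comparison is enough and the date is ignored.
tod <- format(dat$Date_Time, "%H:%M:%S")

# Half-open window [21:00:00, 21:30:00)
in_window <- dat[tod >= "21:00:00" & tod < "21:30:00", ]
in_window
```

With the sample data this should keep only the two rows from 2016-08-02.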

score 0 · Answer 1 · answered May 24 '19 at 02:35

As the date is not important and you are interested only in the time component, you can change the date to today's date. It also looks like you want half-hour intervals: 00:00:00 to 00:30:00, and so on. We can create a sequence of POSIXct breakpoints for the entire day and split the data based on that.

    # Keep the time of day but attach today's date to every row
    df$Date_Time1 <- as.POSIXct(format(df$Date_Time, paste(Sys.Date(), "%T")))

    # 49 breakpoints give 48 half-hour bins, so times after 23:30 are binned too;
    # df[-5] drops the helper column Date_Time1 before splitting
    split(df[-5], droplevels(cut(df$Date_Time1,
       breaks = seq(as.POSIXct("00:00:00", format = "%T"),
                    by = "30 mins", length.out = 49))))

    #$`2019-05-24 21:00:00`
    #            Date_Time  S C P
    #1 2016-08-02 21:14:52 20 1 1
    #2 2016-08-02 21:26:37 35 1 2

    #$`2019-05-24 21:30:00`
    #            Date_Time  S C P
    #3 2016-09-07 21:31:33 28 1 8
    #4 2016-08-25 21:46:16 23 3 4
    #5 2016-08-24 21:54:23 40 1 6

This returns a list of data frames, one per time interval, each holding the rows that fall in that interval. It assumes your Date_Time column is already of class POSIXct. If it is not, convert it first:

    df$Date_Time <- as.POSIXct(df$Date_Time)
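
As an alternative that avoids attaching today's date, the half-hour slot can be computed arithmetically from the seconds since midnight. This is a different technique from the `cut()` approach above, shown only as a minimal sketch: the UTC time zone is an assumption made so the result does not depend on the local zone, and `secs`, `slot`, `slot_label` and `by_slot` are illustrative names.

```r
# Sample data from the question; UTC is an assumption made for reproducibility.
df <- data.frame(
  Date_Time = as.POSIXct(c("2016-08-02 21:14:52", "2016-08-02 21:26:37",
                           "2016-09-07 21:31:33", "2016-08-25 21:46:16",
                           "2016-08-24 21:54:23"), tz = "UTC"),
  S = c(20L, 35L, 28L, 23L, 40L),
  C = c(1L, 1L, 1L, 3L, 1L),
  P = c(1L, 2L, 8L, 4L, 6L)
)

# Seconds since midnight, taken from the broken-down time
lt   <- as.POSIXlt(df$Date_Time)
secs <- lt$hour * 3600 + lt$min * 60 + lt$sec

# Index of the 30-minute slot: 0 is 00:00-00:30, ..., 47 is 23:30-24:00
slot <- as.integer(secs %/% 1800)

# Label each slot by its start time, e.g. "21:00" or "21:30"
slot_label <- sprintf("%02d:%02d", slot %/% 2L, (slot %% 2L) * 30L)

# One data frame per slot that actually occurs in the data
by_slot <- split(df, slot_label)
names(by_slot)
```

Because the list names are plain start times, a single window can then be picked out with `by_slot[["21:00"]]`.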

score 0 · Answer 2 · answered May 24 '19 at 04:01

Here is one option with the tidyverse. We can floor 'Date_Time' to its 30-minute interval and use that to split the data into a list of data.frames.

    library(lubridate)
    library(tidyverse)

    df1 %>%
      # floor each timestamp to its 30-minute slot and keep only the time part
      mutate(grp = format(floor_date(ymd_hms(Date_Time), '30 min'), '%H:%M:%S')) %>%
      # split on grp and drop it from the output (`.keep` was `keep` before dplyr 1.0.0)
      group_split(grp, .keep = FALSE)
    #[[1]]
    # A tibble: 2 x 4
    #  Date_Time               S     C     P
    #  <chr>               <int> <int> <int>
    #1 2016-08-02 21:14:52    20     1     1
    #2 2016-08-02 21:26:37    35     1     2

    #[[2]]
    # A tibble: 3 x 4
    #  Date_Time               S     C     P
    #  <chr>               <int> <int> <int>
    #1 2016-09-07 21:31:33    28     1     8
    #2 2016-08-25 21:46:16    23     3     4
    #3 2016-08-24 21:54:23    40     1     6

data

    df1 <- structure(list(Date_Time = c("2016-08-02 21:14:52", "2016-08-02 21:26:37",
    "2016-09-07 21:31:33", "2016-08-25 21:46:16", "2016-08-24 21:54:23"
    ), S = c(20L, 35L, 28L, 23L, 40L), C = c(1L, 1L, 1L, 3L, 1L),
        P = c(1L, 2L, 8L, 4L, 6L)), class = "data.frame", row.names = c(NA,
    -5L))
