4

I have been looking around but I still couldn't find a way to subset my dataframe by time, here is the sample data:

 Duration            End Date          Start Date
      228 2013-01-03 09:10:00 2013-01-03 09:06:00
     1675 2013-01-04 17:34:00 2013-01-04 17:06:00
      393 2013-01-04 17:54:00 2013-01-04 17:48:00
      426 2013-01-04 11:10:00 2013-01-04 11:03:00
      827 2013-01-01 16:13:00 2013-01-01 15:59:00
      780 2013-01-01 16:13:00 2013-01-01 16:00:00

The End Date and Start Date are in POSIXct format, and here is what I have tried if I only what times between 8:00 to 9:30.

tm1 <- as.POSIXct("08:00", format = "%H:%M")
tm2 <- as.POSIXct("09:30", format = "%H:%M")
df.time <- with(df, df[format('Start Date', '%H:%M')>= tm1 & format('End Date', '%H:%M')< tm2, ])

but this returns an error. I have also tried this, but it didn't work as well.

df.time <- subset(df, format('Start Date', '%H:%M') >= '8:00' & format('End Date', '%H:%M') < '9:30'))

if anybody tell me what am I doing wrong? Thanks

user3773503
  • 161
  • 2
  • 5
  • 12

1 Answers1

5

Assuming that the start and end dates are always the same and only the times differ and you want those rows for which the time starts at or after 8:00 and ends before 9:30, convert the date/time values to characters strings of the form HH:MM and compare:

subset(DF, format(`Start Date`, "%H:%M") >= "08:00" & 
           format(`End Date`, "%H:%M") < "09:30")

giving:

  Duration            End Date          Start Date
1      228 2013-01-03 09:10:00 2013-01-03 09:06:00

Note: We used the following for DF. (Next time please use dput to provide your data in reproducible form.)

DF <- structure(list(Duration = c(228L, 1675L, 393L, 426L, 827L, 780L
), `End Date` = structure(c(1357222200, 1357338840, 1357340040, 
1357315800, 1357074780, 1357074780), class = c("POSIXct", "POSIXt"
), tzone = ""), `Start Date` = structure(c(1357221960, 1357337160, 
1357339680, 1357315380, 1357073940, 1357074000), class = c("POSIXct", 
"POSIXt"), tzone = "")), .Names = c("Duration", "End Date", "Start Date"
), row.names = c(NA, -6L), class = "data.frame")
G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341