Problem:

I am trying to filter my data frame by datetime ranges that depend on an ID column.

Specifically, for observations where df$id == "A", I want to remove rows between 2017-08-05 00:20:00 and 2017-08-10 13:55:00, endpoints included. For observations where df$id == "B", I want to remove rows in a different interval, from 2017-08-05 00:30:00 to 2017-08-10 13:55:00, again including the endpoints.

Example dataframe:

date <- as.POSIXct(c("2017-08-04 16:40:00","2017-08-05 00:20:00","2017-08-10 13:55:00","2017-08-15 08:35:00", "2017-08-04 17:20:00","2017-08-05 00:30:00","2017-08-10 13:55:00","2017-08-15 09:30:00"), format = "%Y-%m-%d %H:%M:%S")
value <- as.numeric(c(1, 2, 3, 4, 1, 2, 3, 4))
id <- as.factor(c("A","A","A","A","B","B","B","B"))
df <- data.frame(date, value, id)

Desired output:

               date value id
2017-08-04 16:40:00     1  A
2017-08-15 08:35:00     4  A
2017-08-04 17:20:00     1  B
2017-08-15 09:30:00     4  B

Thanks!

Edit: if your dataframe has a third category (df$id == "C") that you want to preserve in its entirety:

df[which(
  (df$id == "A" & (df$date < "2017-08-05 00:20:00" | df$date > "2017-08-10 13:55:00")) |
  (df$id == "B" & (df$date < "2017-08-05 00:30:00" | df$date > "2017-08-10 13:55:00")) |
  df$id == "C"
), ]
philiporlando

1 Answer

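# Keep only rows that fall outside each id's exclusion window.
# which() also drops rows where the condition evaluates to NA.
df[which(
  (df$id == "A" & (df$date < "2017-08-05 00:20:00" | df$date > "2017-08-10 13:55:00")) |
  (df$id == "B" & (df$date < "2017-08-05 00:30:00" | df$date > "2017-08-10 13:55:00"))
), ]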
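
A note on why the condition is wrapped in which(): if any date is NA, plain logical indexing returns an all-NA row for it, whereas which() silently drops it. The snippet below is only a small illustration of that behaviour; the `toy` data frame and `cutoff` value are made up for the example and are not part of the question's data.

```r
# Hypothetical three-row data frame with one missing timestamp.
toy <- data.frame(
  date = as.POSIXct(c("2017-08-04 16:40:00", NA, "2017-08-15 08:35:00"), tz = "UTC"),
  value = c(1, 2, 3)
)
cutoff <- as.POSIXct("2017-08-05 00:20:00", tz = "UTC")

# The comparison yields TRUE, NA, FALSE.
keep <- toy$date < cutoff

# Logical indexing keeps the NA position as a row filled with NA.
with_na <- toy[keep, ]

# which() returns only the positions that are TRUE, so the NA row disappears.
without_na <- toy[which(keep), ]
```

If the real data has no missing dates, both forms should return the same rows.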
Brendan
  • Thanks! This is very close, but the first date within the `df$id == "B"` should be `2017-08-05 00:30:00` instead of `2017-08-04 17:20:00`. – philiporlando Oct 10 '17 at 23:18
  • This solution worked for my example, but doesn't do exactly what I would want for my actual data. What would you do if you wanted to apply the same date filters for `A` and `B`, but you wanted to keep all of the data for a different id, say `df$id == "C"`? – philiporlando Oct 11 '17 at 00:16
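
One possible way to handle ids that should be kept in full, without listing each of them in the condition, is to store the exclusion windows in a small lookup table and match on id. This is a base R sketch and not the answer's method. The `windows` table, the extra "C" row, and the explicit `tz = "UTC"` are assumptions added for illustration. Any id missing from `windows` is left untouched.

```r
# Example data from the question, plus a hypothetical id "C" with no exclusion window.
df <- data.frame(
  date = as.POSIXct(c("2017-08-04 16:40:00", "2017-08-05 00:20:00",
                      "2017-08-10 13:55:00", "2017-08-15 08:35:00",
                      "2017-08-04 17:20:00", "2017-08-05 00:30:00",
                      "2017-08-10 13:55:00", "2017-08-15 09:30:00",
                      "2017-08-06 12:00:00"), tz = "UTC"),
  value = c(1, 2, 3, 4, 1, 2, 3, 4, 5),
  id = factor(c("A", "A", "A", "A", "B", "B", "B", "B", "C"))
)

# One row per id that has a window to remove (bounds are inclusive).
windows <- data.frame(
  id = c("A", "B"),
  start = as.POSIXct(c("2017-08-05 00:20:00", "2017-08-05 00:30:00"), tz = "UTC"),
  end = as.POSIXct(c("2017-08-10 13:55:00", "2017-08-10 13:55:00"), tz = "UTC")
)

# match() returns NA for ids without a window, e.g. "C".
idx <- match(as.character(df$id), windows$id)

# A row is inside a window only if its id has one and its date lies within the bounds.
in_window <- !is.na(idx) &
  df$date >= windows$start[idx] &
  df$date <= windows$end[idx]

result <- df[!in_window, ]
```

With this layout, adding a window for another id means adding a row to `windows` rather than another clause to the filter.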