I have a dataframe with multiple columns where one of the columns is of class POSIXct. I would like to remove rows from my dataframe where the row's date/time (as determined from the POSIXct column) has not been preceded by a date/time within the past 24 hours, NOT including the previous 3 hours.
In Excel, I can do this rather easily by creating a new column like this:
=IF(COUNTIFS(datetimecolumn, "<" & currentdatetime, datetimecolumn, ">" & (currentdatetime-1), datetimecolumn, "<" & (currentdatetime-3/24)) > 0, 1, 0)
and then removing accordingly.
I can see doing a "for-loop" with "if-statements" in R and accomplishing the same task, but I'm wondering if there is a more parsimonious method using, say, data.table or dplyr. Here is an example of what my data looks like with my Excel solution at the far right, where the 0's are keepers, and the 1's are to be removed.
datetime test
7/24/2012 12:15 0 #First point, so no issues
7/24/2012 15:00 0 #Even though this point is within 24 hours of the previous point, it is less than 3 hours, so it's OK
7/24/2012 15:15 0 #Ditto for this point
7/24/2012 15:30 1 #Now this point is out of the three hour window, so it's bad
7/24/2012 16:00 1 #Ditto for this point
7/24/2012 17:00 1 #Ditto for this point
7/24/2012 17:30 1 #Ditto for this point
7/28/2012 20:15 0 #This point has no previous points within 24 hours, so OK
7/29/2012 6:30 1 #This point has a previous point within 24 hours that is also not in a previous 3 hour window, so it's bad
7/30/2012 16:30 0 #This point has no previous points within 24 hours, so OK
7/30/2012 16:45 0
7/30/2012 17:00 0
7/30/2012 17:15 0
7/30/2012 17:30 0
7/30/2012 17:45 0
7/30/2012 18:00 0
7/30/2012 18:15 0
7/31/2012 16:45 1
8/2/2012 20:15 0
8/3/2012 16:00 1
8/4/2012 17:45 0
8/4/2012 18:00 0
8/4/2012 18:30 0
8/4/2012 19:15 0
8/4/2012 19:30 0
8/4/2012 19:45 0
8/4/2012 20:30 0
8/5/2012 9:15 1
8/5/2012 9:30 1
Any help is greatly appreciated. Thank you!
Data, kindly provided by @jeremycg:
data = structure(list(datetime = structure(c(1343146500, 1343156400,
1343157300, 1343158200, 1343160000, 1343163600, 1343165400, 1343520900,
1343557800, 1343680200, 1343681100, 1343682000, 1343682900, 1343683800,
1343684700, 1343685600, 1343686500, 1343767500, 1343952900, 1344024000,
1344116700, 1344117600, 1344119400, 1344122100, 1344123000, 1344123900,
1344126600, 1344172500, 1344173400), class = c("POSIXct", "POSIXt"
), tzone = ""), test = c(0L, 0L, 0L, 1L, 1L, 1L, 1L, 0L, 1L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 1L, 1L)), .Names = c("datetime", "test"), row.names = c(NA,
-29L), class = "data.frame")