Update: This seems to be very well-described in SQL forums -- how to account for the gaps in-between time ranges (many of which overlap.) So I may have to turn to SQL to quickly solve this problem, but I'm surprised it cannot be done in "R". It would appear that the object used by interval gets almost all the way there, but outside of a slow loop, it seems difficult to apply on a vector-wide analysis. Please do let me know if you have any ideas, but here's a description of the problem and its solution in SQL:
.... What I'd like to do is come up with a list of non-activity time from a log, and then filter down on it to show a minimum amount of time of non-activity.
1/17/2012 0:15 1/17/2012 0:31
1/20/2012 0:21 1/20/2012 0:22
1/15/2013 1:08 1/15/2013 1:10
1/15/2013 1:08 1/15/2013 1:10
1/15/2013 7:39 1/15/2013 7:41
1/15/2013 7:39 1/15/2013 7:41
1/16/2013 1:11 1/16/2013 1:15
1/16/2013 1:11 1/16/2013 1:15
I was going to just lag the end times into the start row and compute the difference, but then it became clear there were overlapping activities. I also tried "price is right" type matching to get the closest end time... except, of course, if things are going on simultaneously, this doesn't guarantee there's no activity from a still-unfinished simultaneous task.
I currently have date-time in and date-time out columns. I am hoping there is a better idea than taking the many millions of entries, and using seq.POSIXt to write every individual minute that has activity? But even that doesn't seem very workable.. But it would seem there would be some easy way to identify gaps of time of a minimum size, whether it be 5 minutes or 30. Any suggestions?