
I am having trouble subsetting data based on different attributes in different columns. Here is a dummy data set with species, area where it was found, and time (already in POSIXct).

SP Time Area
B 07:22 1
F 09:22 4
A 09:22 1
C 08:17 3
D 09:20 1
E 06:55 4
D 09:03 1
E 09:12 2
F 09:45 1
B 09:15 1

I need to subset the rows that have SP==A, plus all other species occurring in the same area (in this case 1) within 30 minutes before or after that record, returning this:

SP Time Area
A 09:22 1
D 09:20 1
D 09:03 1 
F 09:45 1
B 09:15 1

I can't get past the conditional statement for this 1-hour window. Should I use a for loop here, or is there a simpler way of subsetting this? Many thanks in advance.

Karl
  • We appreciate the clearly stated question w/ the reproducible example, +1. Outside of this example, will there be cases where there are multiple `A`'s, & so multiple sets of conditions (when including `Areas` & `Time` windows)? – gung - Reinstate Monica Sep 08 '13 at 23:54
  • Yes, Species A would appear repeatedly throughout the data set. Therefore the output would include other time windows and other areas. – Karl Sep 09 '13 at 00:10
  • Please show us what you have done so far. – Metrics Sep 09 '13 at 00:27

1 Answer


Reproducing just your initial result with one A value, assuming your data is called dat, can be done like so:

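# Keep rows in A's area that fall within 30 minutes of A's time
# (assumes exactly one row with SP == "A")
with(dat, dat[
  (SP == "A" | Area == Area[SP == "A"]) &
    abs(difftime(Time, Time[SP == "A"], units = "mins")) <= 30,
])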

Result:

   SP                Time Area
3   A 2013-09-09 09:22:00    1
5   D 2013-09-09 09:20:00    1
7   D 2013-09-09 09:03:00    1
9   F 2013-09-09 09:45:00    1
10  B 2013-09-09 09:15:00    1

To account for multiple occurrences of A, things get a touch more complex:

with(dat,dat[
  (
    SP=="A" |
    Area %in% Area[SP=="A"]
  ) & 
  apply(
    sapply(Time[SP=="A"],
    function(x) abs(difftime(Time,x,units="mins"))<=30 ),1,any
  )
,]
)

Though I'm sure there is probably a simplification possible here somewhere.
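One caveat with the multi-A version: the area test and the time test are evaluated separately, so a row can pass by sharing an area with one A and a time window with a different A. Below is a minimal sketch that checks both conditions against the same A sighting. The helper name `near_target` and its arguments are hypothetical, and the date attached to the times is an arbitrary assumption; only the clock times come from the question.

```r
# Rebuild the question's example. The date is arbitrary (an assumption);
# only the clock times come from the question.
dat <- data.frame(
  SP   = c("B", "F", "A", "C", "D", "E", "D", "E", "F", "B"),
  Time = as.POSIXct(paste("2013-09-09",
                          c("07:22", "09:22", "09:22", "08:17", "09:20",
                            "06:55", "09:03", "09:12", "09:45", "09:15")),
                    tz = "UTC"),
  Area = c(1, 4, 1, 3, 1, 4, 1, 2, 1, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of the original answer.
near_target <- function(d, target = "A", window_mins = 30) {
  idx <- which(d$SP == target)
  # One logical column per target sighting: same area AND within the
  # time window of that particular sighting.
  hits <- vapply(idx, function(i) {
    d$Area == d$Area[i] &
      abs(as.numeric(difftime(d$Time, d$Time[i], units = "mins"))) <= window_mins
  }, logical(nrow(d)))
  # Keep a row if it matches at least one sighting.
  d[rowSums(hits) > 0, ]
}

result <- near_target(dat)
```

On the example data this should select the same five rows as the single-A code. Pairing area and time per sighting costs one pass over the data for each A row, which is likely fine for modest data sets but may need a join-based approach for large ones.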

thelatemail
  • This code works well for selecting the co-occurrences within the 1-hour time window, but the output gives co-occurrences in different areas. Despite that, the solution for the time window is quite elegant and the biggest hurdle has been cleared. I will be working on clearing the area-co-occurrence. – Karl Sep 09 '13 at 18:55