I am looking to combine lubridate intervals such that, if they overlap, I take the start of the earliest interval and the end of the latest interval and summarise them into a single new interval that spans the entire period. Here is a reprex:
library(lubridate, warn.conflicts = FALSE)
library(dplyr, warn.conflicts = FALSE)
library(tibble)
dat <- tibble(
animal = rep(c("elk", "wolf", "moose"), each = 2),
date_interval = c(
interval(as.Date("2020-04-01"), as.Date("2020-04-05")),
interval(as.Date("2020-04-10"), as.Date("2020-04-15")),
interval(as.Date("2020-03-01"), as.Date("2020-04-01")),
interval(as.Date("2020-02-15"), as.Date("2020-03-15")),
interval(as.Date("2020-10-01"), as.Date("2020-11-01")),
interval(as.Date("2020-09-15"), as.Date("2020-10-15"))
)
)
dat
#> # A tibble: 6 x 2
#> animal date_interval
#> <chr> <Interval>
#> 1 elk 2020-04-01 UTC--2020-04-05 UTC
#> 2 elk 2020-04-10 UTC--2020-04-15 UTC
#> 3 wolf 2020-03-01 UTC--2020-04-01 UTC
#> 4 wolf 2020-02-15 UTC--2020-03-15 UTC
#> 5 moose 2020-10-01 UTC--2020-11-01 UTC
#> 6 moose 2020-09-15 UTC--2020-10-15 UTC
OK, so in the wolf and moose levels we have overlapping intervals. Assuming that this is the same wolf and the same moose, something like this would double count the overlapping days (time_length() returns seconds by default):
dat %>%
group_by(animal) %>%
mutate(time = time_length(date_interval)) %>%
summarise(time_cumu = sum(time))
#> `summarise()` ungrouping output (override with `.groups` argument)
#> # A tibble: 3 x 2
#> animal time_cumu
#> <chr> <dbl>
#> 1 elk 777600
#> 2 moose 5270400
#> 3 wolf 5184000
This is the type of output I would like to get, with the overlapping intervals merged into one per animal:
tibble(
animal = c("elk", "elk", "wolf", "moose"),
date_interval = c(
interval(as.Date("2020-04-01"), as.Date("2020-04-05")),
interval(as.Date("2020-04-10"), as.Date("2020-04-15")),
interval(as.Date("2020-02-15"), as.Date("2020-04-01")),
interval(as.Date("2020-09-15"), as.Date("2020-11-01"))
)
)
#> # A tibble: 4 x 2
#> animal date_interval
#> <chr> <Interval>
#> 1 elk 2020-04-01 UTC--2020-04-05 UTC
#> 2 elk 2020-04-10 UTC--2020-04-15 UTC
#> 3 wolf 2020-02-15 UTC--2020-04-01 UTC
#> 4 moose 2020-09-15 UTC--2020-11-01 UTC
Ideas?
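For reference, here is a minimal sketch of one direction I have been considering, not necessarily the best one. Within each animal it sorts by interval start, opens a new group whenever a start falls strictly after the latest end seen so far, and then collapses each group to its min start and max end. The helper name `merge_overlaps` and the columns `from`, `to`, `prev_to` and `grp` are made up for illustration and are not part of lubridate or dplyr. It assumes every interval has start <= end, and it treats intervals that merely touch as overlapping.

```r
library(lubridate, warn.conflicts = FALSE)
library(dplyr, warn.conflicts = FALSE)
library(tibble)

# Same data as in the reprex
dat <- tibble(
  animal = rep(c("elk", "wolf", "moose"), each = 2),
  date_interval = c(
    interval(as.Date("2020-04-01"), as.Date("2020-04-05")),
    interval(as.Date("2020-04-10"), as.Date("2020-04-15")),
    interval(as.Date("2020-03-01"), as.Date("2020-04-01")),
    interval(as.Date("2020-02-15"), as.Date("2020-03-15")),
    interval(as.Date("2020-10-01"), as.Date("2020-11-01")),
    interval(as.Date("2020-09-15"), as.Date("2020-10-15"))
  )
)

# Hypothetical helper: merge overlapping intervals within each animal.
# Assumes start <= end for every interval.
merge_overlaps <- function(data) {
  data %>%
    mutate(
      from = int_start(date_interval),
      to = int_end(date_interval)
    ) %>%
    group_by(animal) %>%
    arrange(from, .by_group = TRUE) %>%
    mutate(
      # Latest end among all earlier rows of this animal (-Inf for the first row).
      # cummax() is not defined for date-times, hence as.numeric().
      prev_to = lag(cummax(as.numeric(to)), default = -Inf),
      # A row opens a new group when it starts strictly after everything before it
      grp = cumsum(as.numeric(from) > prev_to)
    ) %>%
    group_by(animal, grp) %>%
    summarise(from = min(from), to = max(to), .groups = "drop") %>%
    # Rebuild the interval column after ungrouping
    mutate(date_interval = interval(from, to)) %>%
    select(animal, date_interval)
}

merged <- merge_overlaps(dat)
merged
```

On the reprex data this should keep the two separate elk intervals and return one merged interval each for moose and wolf, with rows sorted by animal rather than in the original order. Summing `time_length(date_interval)` per animal on `merged` would then no longer double count the overlap.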