Edit (2019-06): This problem does not exist anymore, as this issue has been closed and a related feature implemented. If you now run the code with updated packages, it will work.
I'm trying to find overlapping intervals and decided to join the interval data on itself with dplyr::left_join()
so that I could compare intervals with lubridate::int_overlaps()
to every other interval by the same id.
Here's how I expect left_join()
to behave. The two tibbles with three rows cross to form a tibble with 9 rows:
library(tidyverse)
tibble(a = rep("a", 3), b = rep(1, 3)) %>%
left_join(tibble(a = rep("a", 3), c = rep(2, 3)))
Joining, by = "a"
# A tibble: 9 x 3
a b c
<chr> <dbl> <dbl>
1 a 1 2
2 a 1 2
3 a 1 2
4 a 1 2
5 a 1 2
6 a 1 2
7 a 1 2
8 a 1 2
9 a 1 2
And here's how the same code behaves with intervals. I get nine rows but the rows don't cross like they do above:
tibble(a = rep("a", 3), b = rep(make_date(2001) %--% make_date(2002), 3)) %>%
left_join(tibble(a = rep("a", 3), c = rep(make_date(2002) %--% make_date(2003))))
Joining, by = "a"
# A tibble: 9 x 3
a b c
<chr> <S4: Interval> <S4: Interval>
1 a 2001-01-01 UTC--2002-01-01 UTC 2002-01-01 UTC--2003-01-01 UTC
2 a 2001-01-01 UTC--2002-01-01 UTC 2002-01-01 UTC--2003-01-01 UTC
3 a 2001-01-01 UTC--2002-01-01 UTC 2002-01-01 UTC--2003-01-01 UTC
4 a NA--NA NA--NA
5 a NA--NA NA--NA
6 a NA--NA NA--NA
7 a NA--NA NA--NA
8 a NA--NA NA--NA
9 a NA--NA NA--NA
I think this is unexpected, but I might be missing something? Or is it a bug?