Recently I had to join two dataframes based on their timestamps. The left data contains a fixed timestamp and the right a range. I got it mostly working as you can see in my MWE, but the system tends to produce duplicate results at the crossing point from one range to the next. I've tried all the options, nothing worked.
Is there a nice way to suppress the duplicate entry?
In this example it is the bold one, number 13.
Of course you can try to filter it, but that feels rather hacky.
left.data <- data.frame(
Data.left = 1:20,
Timestamp = seq(
ISOdatetime(2000, 1, 1, 10, 55, 10),
by = "5 min",
length.out = 20
)
)
right.data <- data.frame(
Data.right = 1:10,
PeriodStartTime = seq(
ISOdatetime(2000, 1, 1, 10, 55, 10),
by = "hour",
length.out = 10
),
PeriodEndTime = seq(
ISOdatetime(2000, 1, 1, 11, 55, 10),
by = "hour",
length.out = 10
)
)
result <- fuzzyjoin::interval_join(
x = left.data,
y = right.data,
by = c(
"Timestamp" = "PeriodStartTime",
"Timestamp" = "PeriodEndTime"
),
mode = "left"
)
Data.left | Timestamp | Data.right | PeriodStartTime | PeriodEndTime |
---|---|---|---|---|
1 | 2000-01-01 10:55:10 | 1 | 2000-01-01 10:55:10 | 2000-01-01 11:55:10 |
2 | 2000-01-01 11:00:10 | 1 | 2000-01-01 10:55:10 | 2000-01-01 11:55:10 |
3 | 2000-01-01 11:05:10 | 1 | 2000-01-01 10:55:10 | 2000-01-01 11:55:10 |
4 | 2000-01-01 11:10:10 | 1 | 2000-01-01 10:55:10 | 2000-01-01 11:55:10 |
5 | 2000-01-01 11:15:10 | 1 | 2000-01-01 10:55:10 | 2000-01-01 11:55:10 |
6 | 2000-01-01 11:20:10 | 1 | 2000-01-01 10:55:10 | 2000-01-01 11:55:10 |
7 | 2000-01-01 11:25:10 | 1 | 2000-01-01 10:55:10 | 2000-01-01 11:55:10 |
8 | 2000-01-01 11:30:10 | 1 | 2000-01-01 10:55:10 | 2000-01-01 11:55:10 |
9 | 2000-01-01 11:35:10 | 1 | 2000-01-01 10:55:10 | 2000-01-01 11:55:10 |
10 | 2000-01-01 11:40:10 | 1 | 2000-01-01 10:55:10 | 2000-01-01 11:55:10 |
11 | 2000-01-01 11:45:10 | 1 | 2000-01-01 10:55:10 | 2000-01-01 11:55:10 |
12 | 2000-01-01 11:50:10 | 1 | 2000-01-01 10:55:10 | 2000-01-01 11:55:10 |
13 | 2000-01-01 11:55:10 | 1 | 2000-01-01 10:55:10 | 2000-01-01 11:55:10 |
13 | 2000-01-01 11:55:10 | 2 | 2000-01-01 11:55:10 | 2000-01-01 12:55:10 |
14 | 2000-01-01 12:00:10 | 2 | 2000-01-01 11:55:10 | 2000-01-01 12:55:10 |
15 | 2000-01-01 12:05:10 | 2 | 2000-01-01 11:55:10 | 2000-01-01 12:55:10 |
16 | 2000-01-01 12:10:10 | 2 | 2000-01-01 11:55:10 | 2000-01-01 12:55:10 |
17 | 2000-01-01 12:15:10 | 2 | 2000-01-01 11:55:10 | 2000-01-01 12:55:10 |
18 | 2000-01-01 12:20:10 | 2 | 2000-01-01 11:55:10 | 2000-01-01 12:55:10 |
19 | 2000-01-01 12:25:10 | 2 | 2000-01-01 11:55:10 | 2000-01-01 12:55:10 |
20 | 2000-01-01 12:30:10 | 2 | 2000-01-01 11:55:10 | 2000-01-01 12:55:10 |