I want to join two tables A & B by ID and find in B the most recent date that is anterior to A[date].
After some search it seems that fuzzyjoin
allow to join on date ranges :
library(fuzzyjoin)
fuzzy_left_join(A, B,
by = c("ID" = "ID",
"date" = "date"),
match_fun = list("==","<"))
Problem is that this will return many records (if they exist in B), when I just want the most recent.
Any idea on how to proceed ?
EDIT :
A <- data.frame(ID=c(1,2,3),
date = c('2019-04-03','2019-05-13','2019-05-27'))
B <- data.frame(ID=c(1,1,2,3,4),
date = c('2018-01-01','2019-01-01','2019-02-20','2019-06-01','2019-01-01'),
value = c(1,1.5,1.2,3.7,4))
> A
ID date
1 1 2019-04-03
2 2 2019-05-13
3 3 2019-05-27
> B
ID date value
1 1 2018-01-01 1.0
2 1 2019-01-01 1.5
3 2 2019-02-20 1.2
4 3 2019-06-01 3.7
5 4 2019-01-01 4.0
Expected output :
ID date value
1 1 2019-04-03 1.5
2 2 2019-05-13 1.2
3 3 2019-05-27 NA