I am trying to run this code:
main_df %>%
fuzzy_anti_join(secondary_df, match_fun = list(`==`, `%within%`),
by = c("ID","Date" = "Date_Interval"))
The issue is that it returns the following error:
Error in dplyr::group_by():
! Must group by variables found in .data.
✖ Column col is not found.
I already know why this is happening. The "Date" column is in the right table "secondary_df" and the "Date_Interval" column is in the left table "main_df", so the join looks for "Date" on the left side and "Date_Interval" on the right side and finds neither.
However, I need to keep "main_df" as the left table and "secondary_df" as the right table. I cannot simply switch my join variables like so: by = c("ID", "Date_Interval" = "Date")
because %within% would then test whether Date_Interval is within Date, and I want to match where Date is within Date_Interval.
I have also tried this:
test_df <- main_df %>%
fuzzy_anti_join(secondary_df,match_fun = list(`==`, `.y %within% .x`),
by = c("ID","Date" = "Date_Interval"))
but the second match_fun entry is not handled correctly. I still have a feeling that there is a way to fix this by changing the %within%
part of match_fun so that it swaps the two sides, but I have not found it yet.
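To make the idea concrete, here is the kind of call I am picturing. It is only a sketch that I have not been able to verify: it assumes that match_fun accepts an ordinary two-argument function receiving the left column first and the right column second, and that an Interval column survives the join internals. The toy data and the helper name within_flipped are made up for illustration.

```r
library(dplyr)
library(lubridate)
library(fuzzyjoin)

# Made-up toy data with the same column layout as my real tables
main_df <- tibble(
  ID = c(1, 2),
  Date_Interval = interval(
    ymd(c("2023-01-01", "2023-02-01")),
    ymd(c("2023-01-31", "2023-02-28"))
  )
)
secondary_df <- tibble(
  ID = c(1, 2),
  Date = ymd(c("2023-01-15", "2023-03-05"))
)

# Hypothetical helper: x is assumed to be the left column (Date_Interval)
# and y the right column (Date), so the operands of %within% are swapped.
within_flipped <- function(x, y) y %within% x

test_df <- main_df %>%
  fuzzy_anti_join(
    secondary_df,
    # names are columns of main_df (left), values are columns of secondary_df (right)
    by = c("ID" = "ID", "Date_Interval" = "Date"),
    match_fun = list(`==`, within_flipped)
  )
```

If this is the right direction, test_df should keep only the rows of main_df for which no secondary_df row has the same ID and a Date inside Date_Interval.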
Please help!