Some example data:
example_df <- data.frame(
  url = c('blog/blah', 'blog/?utm_medium=foo', 'blah', 'subscription/apples', 'UK/something'),
  numbs = 1:5
)
lookup_df <- data.frame(
  string = c('blog', 'subscription', 'UK'),
  group = c('blog', 'subs', 'UK')
)
library(dplyr)      # provides the %>% pipe
library(fuzzyjoin)
data_combined <- example_df %>%
  fuzzy_left_join(lookup_df, by = c("url" = "string"),
                  match_fun = `%in%`)
data_combined
                   url numbs string group
1            blog/blah     1   <NA>  <NA>
2 blog/?utm_medium=foo     2   <NA>  <NA>
3                 blah     3   <NA>  <NA>
4  subscription/apples     4   <NA>  <NA>
5         UK/something     5   <NA>  <NA>
I expected data_combined to have values for string and group wherever match_fun finds a match. Instead, every row is NA.
For example, the first value of string in lookup_df is 'blog'. Since I took this to be %in% the first value of example_df$url ('blog/blah'), I expected that row to match, with 'blog' in both the string and group fields.
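
To make the expectation concrete, here is a minimal sketch of what I think is going on and of the result I am after. It rests on two assumptions that are mine: that `%in%` only tests exact element membership rather than substring containment, and that treating the lookup strings as plain regular expressions is acceptable (none of them contain regex metacharacters). It uses fuzzyjoin's regex_left_join() in place of fuzzy_left_join() with `%in%`; exact_hit and substring_hit are names made up for illustration.

```r
library(fuzzyjoin)

example_df <- data.frame(
  url = c('blog/blah', 'blog/?utm_medium=foo', 'blah', 'subscription/apples', 'UK/something'),
  numbs = 1:5,
  stringsAsFactors = FALSE
)
lookup_df <- data.frame(
  string = c('blog', 'subscription', 'UK'),
  group = c('blog', 'subs', 'UK'),
  stringsAsFactors = FALSE
)

# %in% checks exact membership, so a longer url never equals a lookup string
exact_hit <- 'blog/blah' %in% lookup_df$string
# grepl() checks whether the pattern occurs anywhere inside the string
substring_hit <- grepl('blog', 'blog/blah', fixed = TRUE)

# regex_left_join() treats lookup_df$string as regular expressions and
# keeps every row of example_df, filling NA where no pattern matches
data_combined <- regex_left_join(example_df, lookup_df, by = c("url" = "string"))
data_combined
```

With this, the rows whose url contains 'blog', 'subscription' or 'UK' should pick up the corresponding string and group, while 'blah' should stay NA because it contains none of the lookup strings.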