Confused about multi_by and multi_match_fun in R fuzzy_join

Asked Sep 14 '22 at 10:55

Can someone help me understand what "multi_by" and "multi_match_fun" actually do, compared with "by" and "match_fun", in the R package fuzzyjoin? This is from the package docs (v0.1.6):

by                  Columns of each to join
match_fun           Vectorized function given two columns, returning TRUE or FALSE as to whether they are a match. Can be a list of functions one for each pair of columns specified in by (if a named list, it uses the names in x). If only one function is given it is used on all column pairs.
multi_by            Columns to join, where all columns will be used to test matches together
multi_match_fun     Function to use for testing matches, performed on all columns in each data frame simultaneously

I don't see how the two actually differ. Aren't all columns used to test matches anyway, even with "by"/"match_fun"? In other words, am I wrong in thinking that a match is returned only if ALL column pairs match under the functions supplied in "match_fun"?
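To make the question concrete, here is a minimal sketch of how I read the two descriptions. The toy data, the column names and the threshold of 1 are all made up for illustration. The base R part only mimics the two readings (each column pair tested separately and then ANDed, versus one test that sees all columns at once). The guarded part passes the same criteria to fuzzy_join, on the assumption that multi_match_fun receives two matrices with one row per candidate pair of rows.

```r
# Toy data; every name and value here is hypothetical.
x <- data.frame(id_x = c("a", "b"), x1 = c(0, 0), y1 = c(0, 3))
y <- data.frame(id_y = c("p", "q"), x2 = c(0.8, 0), y2 = c(0.8, 3.5))

# Every candidate pair: row i of x against row j of y.
pairs <- expand.grid(i = seq_len(nrow(x)), j = seq_len(nrow(y)))
dx <- x$x1[pairs$i] - y$x2[pairs$j]
dy <- x$y1[pairs$i] - y$y2[pairs$j]

# Reading of by / match_fun: each column pair is tested on its own,
# and a row pair is kept only if every column pair passes.
per_column <- abs(dx) <= 1 & abs(dy) <= 1

# Reading of multi_by / multi_match_fun: a single test sees all
# columns together, e.g. a Euclidean distance across both of them.
joint <- sqrt(dx^2 + dy^2) <= 1

pairs$per_column <- per_column
pairs$joint <- joint
print(pairs)

# The same two criteria handed to fuzzyjoin, if it is installed.
# Assumption: multi_match_fun gets two matrices (one row per candidate pair).
if (requireNamespace("fuzzyjoin", quietly = TRUE)) {
  res_by <- fuzzyjoin::fuzzy_join(
    x, y,
    by = c("x1" = "x2", "y1" = "y2"),
    match_fun = function(a, b) abs(a - b) <= 1,
    mode = "inner"
  )
  res_multi <- fuzzyjoin::fuzzy_join(
    x, y,
    multi_by = c("x1" = "x2", "y1" = "y2"),
    multi_match_fun = function(a, b) sqrt(rowSums((a - b)^2)) <= 1,
    mode = "inner"
  )
  print(res_by)
  print(res_multi)
}
```

In the base R part, the pair (a, p) passes the per-column test (both differences are 0.8) but fails the joint test (the distance is about 1.13), while (b, q) passes both. If that reading is right, the difference would not be whether all columns are used, but whether the criterion can combine them.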

tospo

0 Answers