I am trying to match rows of two tables when a string from one table is fully present in a column of the other table. So far I have only managed a partial-string join, after which I apply Levenshtein distance to find close matches. This approach has limited use and accuracy. Approach:
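library(dplyr)
library(fuzzyjoin)
library(stringr)
checkg <- check %>%
  fuzzy_inner_join(LOCATIONS, by = c("STRING" = "STRING"), match_fun = str_detect) %>%  # partial (substring) join
  rowwise() %>%
  # both tables share the name STRING, so the joined columns are STRING.x and STRING.y
  mutate(DIST = as.numeric(adist(x = STRING.x, y = STRING.y, ignore.case = TRUE)))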
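To show why the whole-string distance is of limited use here, this is a small toy check using only base R's `adist` and the sample values from the tables below. It is an illustration of the problem, not part of my actual pipeline.

```r
# Toy check with base R only; the strings are the sample values from this question.
target <- "BATANGAS"
candidates <- c("BATNAGAS LUZON", "TANGA")

# adist() returns a matrix of generalized Levenshtein distances
# (rows = x, columns = y).
d <- adist(target, candidates, ignore.case = TRUE)
colnames(d) <- candidates
print(d)

# The wanted match is 6 characters longer than the target, so its distance
# is at least 6, while the unwanted substring "TANGA" is only 3 deletions away.
```

So ranking by the raw distance puts the pair I want to reject ahead of the pair I want to accept.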
Is there any way to map the tables as shown below? The STATUS column in the output table is only there to make clear that partial string matching is not the objective; it is not required in the output. Thanks.
TABLE1
**STRING**
BATANGAS
QINGDAO
TABLE2
**STRING**
BATNAGAS LUZON
QINGDAO PT
OUTPUT TABLE checkg

| TABLE1.STRING | TABLE2.STRING | STATUS |
|---|---|---|
| BATANGAS | BATNAGAS LUZON | Accept |
| QINGDAO | QINGDAO PT | Accept |
| BATANGAS | TANGA | Reject |
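To make the rule I am after more concrete, here is a rough sketch of one way it might be expressed: compare the TABLE1 string against each whole word of the TABLE2 string and accept when the closest word is within a small edit distance. The helper name `word_match` and the threshold of 2 are my own assumptions for illustration, not something I have validated on the real data.

```r
# Hypothetical helper: accept a pair when some whole word of `candidate`
# is within `max_dist` edits of `target`. The threshold of 2 is an assumption.
word_match <- function(target, candidate, max_dist = 2) {
  words <- strsplit(candidate, "\\s+")[[1]]
  min(adist(target, words, ignore.case = TRUE)) <= max_dist
}

# The three example pairs from the output table above.
pairs <- data.frame(
  TABLE1 = c("BATANGAS", "QINGDAO", "BATANGAS"),
  TABLE2 = c("BATNAGAS LUZON", "QINGDAO PT", "TANGA"),
  stringsAsFactors = FALSE
)

# Apply the rule pair by pair and label the result.
accepted <- mapply(word_match, pairs$TABLE1, pairs$TABLE2, USE.NAMES = FALSE)
pairs$STATUS <- ifelse(accepted, "Accept", "Reject")
print(pairs)
```

A fixed threshold would probably accept too much for very short words, so scaling it by word length may be needed.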