Surprisingly, the following two strings do not match. I was expecting the distance to be at most 5, but even with max_dist = 7 I get no match:
library(fuzzyjoin)
one <- as.data.frame("Other field crops (non-organic)")
names(one) <- "A"
two <- as.data.frame("other_field_crops_non_organic")
names(two) <- "A"
stringdist_left_join(one, two, by = "A", method = "lcs", max_dist = 7, ignore_case=TRUE)
A.x A.y
1 Other field crops (non-organic) <NA>
Only at max_dist = 10 do I get a match:
stringdist_left_join(one, two, by = "A", method = "lcs", max_dist = 10, ignore_case=TRUE)
A.x A.y
1 Other field crops (non-organic) other_field_crops_non_organic
Could someone explain to me why this distance is larger than 9? Does it have to do with the brackets? And if so, how can I circumvent this issue without removing the brackets?
EDIT
library(fuzzyjoin)
one <- as.data.frame("Other field crops non-organic")
names(one) <- "A"
two <- as.data.frame("other_field_crops_non_organic")
names(two) <- "A"
stringdist_left_join(one, two, by = "A", method = "lcs", max_dist = 5, ignore_case=TRUE)
A.x A.y
1 Other field crops non-organic <NA>
Even without the brackets I cannot get the distance within 5.
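To see the raw numbers behind these joins, here is a minimal base-R sketch of an LCS-style distance. It assumes that the "lcs" method counts only insertions and deletions, so the distance is the number of characters in both strings that are not part of their longest common subsequence. The helper lcs_dist is a hypothetical name written for illustration, not part of fuzzyjoin; I would expect stringdist::stringdist(tolower(a), tolower(b), method = "lcs") to give the same values, but I have not verified that here.

```r
# Hypothetical helper: LCS distance via dynamic programming.
# Only insertions and deletions are counted (no substitutions).
lcs_dist <- function(a, b) {
  # Lower-case both inputs to mimic ignore_case = TRUE
  x <- strsplit(tolower(a), "")[[1]]
  y <- strsplit(tolower(b), "")[[1]]
  # len[i + 1, j + 1] = LCS length of the first i chars of x and first j chars of y
  len <- matrix(0L, length(x) + 1, length(y) + 1)
  for (i in seq_along(x)) {
    for (j in seq_along(y)) {
      len[i + 1, j + 1] <- if (x[i] == y[j]) {
        len[i, j] + 1L
      } else {
        max(len[i, j + 1], len[i + 1, j])
      }
    }
  }
  # Every character outside the common subsequence costs one edit
  length(x) + length(y) - 2L * len[length(x) + 1, length(y) + 1]
}

d_brackets <- lcs_dist("Other field crops (non-organic)", "other_field_crops_non_organic")
d_plain    <- lcs_dist("Other field crops non-organic", "other_field_crops_non_organic")
print(c(d_brackets, d_plain))
```

Under this definition each space/underscore mismatch costs 2 (one deletion plus one insertion) rather than 1, and the brackets and hyphen cost 1 each, which would be consistent with the first join only succeeding at max_dist = 10.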