I am trying to match cell phone tower IDs contained in one table against a master table that holds the locations (latitude and longitude) of cell phone tower IDs. The format of the IDs in the locations table is different from the format in the first table, so I am trying to use agrep() to do a fuzzy match. To give you an example, let's say the ID I am trying to match is:
x <- c("405-800-125-39883")
A sample of IDs located in the locations table:
y <- c("405-810-1802-19883", "405-810-2101-29883", "405-810-1401-31883",
"405-810-5005-49883","125-39883","405-810-660-39883")
I am then using agrep() with different values of max.distance:
agrep(x,y,max.distance=0.3,value=TRUE)
This returns:
[1] "405-810-1802-19883" "405-810-2101-29883" "405-810-1401-31883" "405-810-5005-49883"
[5] "405-810-660-39883"
However, the value that I am really after is "125-39883".
I have also tried the stringdist_join() function from the fuzzyjoin package (which builds on stringdist), applied to the two data frames while varying max_dist, but with no success. Basically, what I am looking for is a perfect match on the number after the last hyphen, then a match on the number before the last hyphen, and so on, working from right to left. Is there any way of doing that?
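To make the rule concrete, here is a rough sketch of the kind of comparison I have in mind. It assumes the IDs can be split on hyphens and compared segment by segment from the right. The helper name trailing_matches is made up for illustration; it is not something agrep() or stringdist provides, and there may well be a cleaner way to do this.

```r
# Hypothetical helper (not part of any package): count how many
# hyphen-separated segments of two IDs agree, starting from the right.
trailing_matches <- function(a, b) {
  sa <- rev(strsplit(a, "-", fixed = TRUE)[[1]])
  sb <- rev(strsplit(b, "-", fixed = TRUE)[[1]])
  n <- min(length(sa), length(sb))
  same <- sa[seq_len(n)] == sb[seq_len(n)]
  # Length of the run of matching segments before the first mismatch
  if (all(same)) n else which(!same)[1] - 1L
}

x <- "405-800-125-39883"
y <- c("405-810-1802-19883", "405-810-2101-29883", "405-810-1401-31883",
       "405-810-5005-49883", "125-39883", "405-810-660-39883")

# Score every candidate against x, then keep the best-scoring one
scores <- vapply(y, trailing_matches, integer(1), a = x, USE.NAMES = FALSE)
best <- y[which.max(scores)]
best
# [1] "125-39883"
```

With the sample data, "125-39883" matches on two trailing segments and "405-810-660-39883" on one, so the first is preferred. Ties would go to whichever candidate comes first in y, which may or may not be what is wanted on real data.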