
Is it possible to perform some type of fuzzy matching without having a table of desired results?

For example, standardising these rows:

Lord Philip Harris
Lord Harris of Peckham
Lord Philip C. Harris
Philip Lord C Harris
Lord Phillip Harris of Peckham

into the first value, the last value or something like 'Philip Harris' - it doesn't really matter which.

I've used the stringdist package in R to append a column to a dataframe containing, for each value, the closest match (the entry at minimum string distance), but to do this I needed a list of 'correct' data to match against.
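To make the lookup-based approach concrete, here is a minimal sketch of matching each value to its nearest entry in a reference list. It is written in Python with the standard library's `difflib` as a stand-in for stringdist, so it is not the original R code, and the similarity measure differs from stringdist's distance methods. The `reference` list (including the second, made-up entry) and the `closest_match` helper are assumptions for illustration only.

```python
from difflib import SequenceMatcher


def closest_match(name, reference):
    """Return the entry in `reference` that is most similar to `name`.

    Similarity is difflib's ratio (0..1, higher is closer), compared
    case-insensitively. This stands in for a minimum string distance.
    """
    return max(
        reference,
        key=lambda ref: SequenceMatcher(None, name.lower(), ref.lower()).ratio(),
    )


# Hypothetical list of 'correct' values; this is the table the question
# would like to avoid needing.
reference = ["Lord Philip Harris", "Baroness Jane Smith"]

names = [
    "Lord Philip Harris",
    "Lord Harris of Peckham",
    "Lord Philip C. Harris",
    "Philip Lord C Harris",
    "Lord Phillip Harris of Peckham",
]

# Equivalent of appending a 'closest match' column to the dataframe.
matched = [closest_match(n, reference) for n in names]

for original, standard in zip(names, matched):
    print(f"{original!r} -> {standard!r}")
```

This only works because `reference` exists up front, which is exactly the dependency the question is trying to remove.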

Thanks

deethreenovice
