Is it possible to perform some type of fuzzy matching without having a reference table of the desired results?
For example, standardising these rows:
Lord Philip Harris
Lord Harris of Peckham
Lord Philip C. Harris
Philip Lord C Harris
Lord Phillip Harris of Peckham
into the first value, the last value or something like 'Philip Harris' - it doesn't really matter which.
I've used the stringdist package in R to append a column to a data frame containing the value that's the minimum distance from each test value, but to do that I needed a list of 'correct' data.
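To make the lookup-based approach concrete, here is a minimal sketch of the idea. It is written in Python with the standard library's difflib as a stand-in for stringdist, so it is not my actual R code. The `closest` helper, the `correct` list and the 'Baroness Jane Smith' entry are made up for illustration.

```python
from difflib import SequenceMatcher


def closest(name, candidates):
    """Return the candidate most similar to `name`.

    SequenceMatcher.ratio() is a similarity score in [0, 1], so the
    highest ratio plays the role of the minimum string distance.
    """
    return max(
        candidates,
        key=lambda c: SequenceMatcher(None, name.lower(), c.lower()).ratio(),
    )


# Hypothetical list of 'correct' values: the part I'd like to do without.
correct = ["Lord Philip Harris", "Baroness Jane Smith"]

rows = [
    "Lord Philip Harris",
    "Lord Harris of Peckham",
    "Lord Philip C. Harris",
    "Philip Lord C Harris",
    "Lord Phillip Harris of Peckham",
]

# Equivalent of appending a 'closest match' column to the data frame.
matched = [(row, closest(row, correct)) for row in rows]

for row, match in matched:
    print(f"{row!r} -> {match!r}")
```

This only works because `correct` exists up front, which is exactly the dependency I'm asking how to remove.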
Thanks