I am trying to find the most suitable method (that I could use in VBA) in order to compare thousands of records from column A to data in column B.
The example of the data can be seen below:
Column1 Column2
Modra Digest (DC) Oldstewart2
South West Local /Sunday Times (new) Oldstewart
OldStewart political print Saigon Last month Saigon Last month
Oldstewart2 Local print (Former) Modra Digest Velehrad Digest (used via Bembek)
Saigon Last month South West Local South West Local /Sunday Times
This is what the table looks like in Excel:
So far, I have tried the Fuzzy lookup (Excel add in) but it doesn't do the job exactly right. I have also tried the Levenshtein Distance method, which seems to work relatively fine once few tweaks are implemented (trim, removal of brackets, de-duplication of words in the string), however, the challenge appears when there is only one matching word in the string (despite it being a "keyword" - please see the "oldstewart" example above), as this method compares the number of changes that need to take place (edit, deletion, replacement etc.) in order to match a string to one another.
The desired result would ideally look somewhat like this, as you can see the matching percentage is lower for the above mentioned examples due to stipulated reasons. (Sorry about the pics, but I am struggling post the table in the same structure as I managed in the original post.*)
The desired output:
*Kindly note that this is not a cross-post, as I was advised to break this project into smaller pieces for better convenience. The broader picture regarding what I'm trying to achieve (incl. the function defined for the Levenshtein distance method) can be found here: String matching in VBA using a predefined function.
Any advice regarding the methodology etc. is highly appreciated - as someone may have dealt with something similar in the past.
Thank you VERY much for your help!