I have a series of company names returned from a web scrape, and I am trying to compare them against a table of other company names to see whether they match or nearly match:

  • Some Company Ltd.
  • Another Company Limited
  • This Ltd.

  • Some Company Ltd.
  • Another Company Ltd.
  • That Limited

Comparing the two lists should flag the first rows as matching, the second rows as a near match, and the third rows as not matching. From what I understand this is a fuzzy search, but I would like some clarification on how best to achieve it. Any ideas or suggestions?

pnuts

1 Answer

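Please refer to the Wikipedia article on Levenshtein distance (http://en.wikipedia.org/wiki/Levenshtein_distance). It counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another, so a small distance indicates a near match. A C implementation can be found in the article's External links section, under "Levenshtein in MySQL".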
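
To make the idea concrete, here is a minimal Python sketch of how the distance could be used to label the rows from the question. It is not the C/MySQL implementation linked from the article. The helper names (`levenshtein`, `normalise`, `classify`), the clean-up step, and the 0.7 similarity threshold are all assumptions chosen for illustration and would need tuning against real data.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard two-row dynamic-programming table."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def normalise(name: str) -> str:
    # Assumed clean-up: ignore case, surrounding spaces and a trailing dot.
    return name.strip().lower().rstrip(".")


def classify(a: str, b: str, near_threshold: float = 0.7) -> str:
    """Label a pair as 'match', 'near match' or 'no match'.

    near_threshold is an illustrative value, not a recommended one.
    """
    a, b = normalise(a), normalise(b)
    if a == b:
        return "match"
    # Scale the distance by the longer name so long and short names compare fairly.
    similarity = 1 - levenshtein(a, b) / max(len(a), len(b))
    return "near match" if similarity >= near_threshold else "no match"


scraped = ["Some Company Ltd.", "Another Company Limited", "This Ltd."]
table = ["Some Company Ltd.", "Another Company Ltd.", "That Limited"]
labels = [classify(s, t) for s, t in zip(scraped, table)]
print(labels)
```

With these assumed settings the three rows come out as a match, a near match, and no match. Dividing by the longer length is one design choice among several; expanding abbreviations such as "Ltd" to "Limited" before comparing would likely make the threshold less sensitive.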

Frank He