
I'm looking for a library that can match two words despite misspellings. For instance, the gem should treat the following comparisons as true (this is just an illustration; it does not need to extend the standard String class):

'Start' == 'Strat'
'woodpecker' == 'Wodpekcer'

Are there any Ruby gems for data quality checking?

Misha Slyusarev
  • If there were any gem that does that, then it means that it would change the definition of `String#==`, which would most likely mess up most Ruby programs. – sawa Dec 20 '13 at 14:17
  • I'm not saying it must look like this. It should just compare two strings and say whether they are equal. – Misha Slyusarev Dec 20 '13 at 14:33
  • @BeatRichartz how to measure distance between words was the first thing to find out, but if you can suggest anything else for data quality checking it would be great. Soundex? Anything else? – Misha Slyusarev Dec 20 '13 at 14:38

2 Answers


As you stated that you are looking for libraries/gems, here are some gems implementing string distance and fuzzy matching:

These libraries do not extend core classes, so you cannot compare strings with the `==` operator, but you can calculate a similarity score and find similar strings.
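To make the idea of a similarity score concrete, here is a minimal plain-Ruby sketch of Levenshtein distance with a normalized score on top. It is not the API of any of the gems; the method names `levenshtein`, `similarity` and `similar?`, the case-insensitive comparison and the 0.5 threshold are all assumptions chosen for illustration.

```ruby
# Classic Levenshtein edit distance (insert, delete, substitute),
# computed row by row so only two rows of the table are kept.
# Comparison is case-insensitive (an assumption for this sketch).
def levenshtein(a, b)
  a = a.downcase
  b = b.downcase
  return b.length if a.empty?
  return a.length if b.empty?

  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1,            # deletion
               curr[j - 1] + 1,        # insertion
               prev[j - 1] + cost].min # substitution or match
    end
    prev = curr
  end
  prev.last
end

# Normalized score in 0.0..1.0, where 1.0 means identical strings.
def similarity(a, b)
  longest = [a.length, b.length].max
  return 1.0 if longest.zero?

  1.0 - levenshtein(a, b).to_f / longest
end

# Hypothetical helper: the threshold is arbitrary and should be tuned
# against your own data.
def similar?(a, b, threshold: 0.5)
  similarity(a, b) >= threshold
end

puts levenshtein('Start', 'Strat')          # 2
puts levenshtein('woodpecker', 'Wodpekcer') # 3
puts similar?('Start', 'Strat')             # true
```

Note that plain Levenshtein counts a swap of two adjacent letters as two edits. If transpositions are your most common typo, a Damerau-Levenshtein variant, which counts a swap as one edit, may suit you better.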

For Soundex, Metaphone and similar algorithms, you can use the wonderful text gem. Phonetic algorithms can be a bit more involved to use, because their results depend on the language: what works perfectly for English might not work for other languages.
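To show what a phonetic key looks like, here is a small plain-Ruby sketch of American Soundex. It is a simplified illustration written for this answer, not the text gem's implementation, and the names `SOUNDEX_CODES` and `soundex` are my own. It assumes English input and silently drops anything outside a-z.

```ruby
# Letter groups of American Soundex. Vowels and 'y' have no code.
SOUNDEX_CODES = {
  'b' => '1', 'f' => '1', 'p' => '1', 'v' => '1',
  'c' => '2', 'g' => '2', 'j' => '2', 'k' => '2',
  'q' => '2', 's' => '2', 'x' => '2', 'z' => '2',
  'd' => '3', 't' => '3',
  'l' => '4',
  'm' => '5', 'n' => '5',
  'r' => '6'
}.freeze

# Returns a four-character key: first letter plus three digits.
def soundex(word)
  letters = word.downcase.gsub(/[^a-z]/, '')
  return '' if letters.empty?

  first = letters[0]
  code = first.upcase
  last_digit = SOUNDEX_CODES[first]

  letters[1..-1].each_char do |ch|
    # 'h' and 'w' are skipped and do not separate equal codes.
    next if ch == 'h' || ch == 'w'

    digit = SOUNDEX_CODES[ch]
    code << digit if digit && digit != last_digit
    last_digit = digit # a vowel resets this to nil
    break if code.length == 4
  end

  code.ljust(4, '0')
end

puts soundex('Start')      # S363
puts soundex('Strat')      # S363
puts soundex('woodpecker') # W312
puts soundex('Wodpekcer')  # W312
```

Two words are then treated as a phonetic match when their keys are equal. Soundex ignores vowels and keeps only the first three consonant codes, so it also produces false positives, which is why it is often combined with an edit-distance check.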

kostja

Do you know about Levenshtein distance?

https://github.com/anjlab/rubyfish is just one gem you can install.

antinome
devanand