
Is there a Spark library to find a phonetic match for a word in a sentence? For example, given the sentence: “There were to people on the scooter”.

Word to find: “two”.

Since “to” and “two” are phonetically similar, it should report a positive match for the word “to” in the sentence.

I found an Elasticsearch phonetic filter which does something similar, but I’m looking for a way to do it on a Spark Streaming cluster.

  • You should be looking for a Java/Python library, not a Spark-specific one. Maybe CoreNLP, for example. And Elasticsearch uses Lucene, so you could look there – OneCricketeer Nov 22 '19 at 07:14
  • Googling "spark phonetic match" revealed some resources, one of them you'll find below in my answer – Alex Nov 29 '19 at 09:23

1 Answer

0

It seems like this post describes exactly what you're looking for: https://medium.com/@mrpowers/fuzzy-matching-in-spark-with-soundex-and-levenshtein-distance-6749f5af8f28
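
As a rough illustration of the Soundex idea named in that post's title, here is a minimal pure-Python sketch. The helper names `soundex` and `phonetic_matches` are hypothetical and are not taken from the post or from any library; the encoder follows the common American Soundex rules as I understand them. On a cluster you would more likely use Spark SQL's built-in `soundex` function (for example after splitting the sentence into tokens) rather than hand-rolled code like this.

```python
import re

# Letter-to-digit table for American Soundex; vowels, H, W and Y have no code.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Return the 4-character Soundex code of `word` ('' if it has no letters)."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    encoded = []
    prev = _CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters that share a code
        code = _CODES.get(ch, "")
        if code and code != prev:
            encoded.append(code)
        prev = code  # a vowel resets prev, so a repeated code after it counts again
    return (first + "".join(encoded) + "000")[:4]


def phonetic_matches(sentence, target):
    """Return the tokens of `sentence` whose Soundex code equals that of `target`."""
    target_code = soundex(target)
    tokens = re.findall(r"[A-Za-z]+", sentence)
    return [tok for tok in tokens if soundex(tok) == target_code]


matches = phonetic_matches("There were to people on the scooter", "two")
print(matches)
```

Note that Soundex is coarse: for this sentence it matches “to” but also “the”, because both reduce to the same code as “two”. That is presumably why the linked post pairs Soundex with Levenshtein distance; a second filter of that kind (or a finer phonetic encoding) would be needed to narrow the result.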

Alex