11

I was wondering whether anyone has tips, or could point me in the right direction, for finding or creating an algorithm that finds rhyming words.

I specifically do not want to use an API, as creating the algorithm just to create it is my end goal.

Not that it should be important, but I'm coding in Java.

Thank you

Gagan Singh

5 Answers

8

This seems like it could be a huge project if you don't want to use an API. The challenging step is determining the phonetics of a word (two words rhyme if their endings are phonetically similar). If you can do this, you can compare the endings of their pronunciations. You could possibly find an API that converts known words to their phonetic spellings, but if you don't want to use APIs you have to do it yourself, and that's no small task. Not to mention, it hasn't been perfected by anyone.

The other method would be to research the Metaphone algorithm, explained here: http://www.blackbeltcoder.com/Articles/algorithms/phonetic-string-comparison-with-soundex

Foggzie
  • Soundex was developed specifically for North American family names during the processing of their census. It is not a general purpose library for phonetic analysis of words, though it is still probably of interest to the asker. It's also focussed on the start of words, whereas rhymes relate to their endings. The technique may be malleable. – Drew Noakes Dec 19 '12 at 21:33
  • I provided that link because of the Metaphone algorithm, not Soundex. – Foggzie Dec 19 '12 at 21:37
6

The best algorithm would use a dictionary of words classified into rhyme groups. It's a very hard problem and needs a linguistics background. I suppose you want some algorithm, probably not the best one, for finding rhymes automatically.

The basic idea is to encode the pronunciation of a word (not the word itself) as some value. Words whose values end with the same codes are identified as rhymes.

From my perspective, this is more a research problem than a matter of finding the correct algorithm.

Take a look at this paper: A System for the Automatic Identification of Rhymes
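A minimal sketch of that idea, assuming each word already has a pronunciation code from somewhere (the codes, the class name `RhymeGroups`, and the fixed key length `n` are all hypothetical choices for illustration, not part of any real library):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class RhymeGroups {
    // Hypothetical rhyme key: the last n symbols of a pronunciation code
    // (or the whole code if it is shorter than n).
    static String rhymeKey(String code, int n) {
        return code.length() <= n ? code : code.substring(code.length() - n);
    }

    // Groups words whose pronunciation codes end in the same key.
    // Input map: word -> pronunciation code (assumed to be computed elsewhere).
    static Map<String, List<String>> group(Map<String, String> codes, int n) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (Map.Entry<String, String> e : codes.entrySet()) {
            groups.computeIfAbsent(rhymeKey(e.getValue(), n), k -> new ArrayList<>())
                  .add(e.getKey());
        }
        return groups;
    }
}
```

With made-up codes such as `cat -> KAT`, `hat -> HAT`, `dog -> DOG` and `n = 2`, `group` puts "cat" and "hat" under the key `AT` and "dog" under `OG`. How good the groups are depends entirely on how the codes are produced, which is the hard part.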

mishadoff
4

I think leveraging a standard phonetic algorithm would be a good idea. Soundex might be a bit limited, but Double Metaphone would probably be a good choice.

Get the metaphone representations of the words in question, remove the first characters, and check whether the remaining portion of the shorter of the two words matches the end of the longer. With double metaphone, it's very similar, but make four comparisons, primary to primary, secondary to primary, primary to secondary and secondary to secondary.
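The comparison step could be sketched roughly like this. It assumes the Metaphone (or Double Metaphone) codes are already computed and passed in as plain strings; the class and method names are hypothetical, "remove the first characters" is read here as dropping the first character of each code, and treating an empty remainder as "no match" is my own choice:

```java
class MetaphoneRhyme {
    // Drops the first character of a code (one reading of the step described above).
    static String tail(String code) {
        return code.length() > 1 ? code.substring(1) : "";
    }

    // True if the shorter tail matches the end of the longer tail.
    static boolean tailsMatch(String codeA, String codeB) {
        String a = tail(codeA);
        String b = tail(codeB);
        if (a.isEmpty() || b.isEmpty()) {
            return false; // assumption: nothing left to compare means no rhyme
        }
        String shorter = a.length() <= b.length() ? a : b;
        String longer = a.length() <= b.length() ? b : a;
        return longer.endsWith(shorter);
    }

    // Double Metaphone variant: the four primary/secondary comparisons.
    static boolean doubleMetaphoneRhyme(String primA, String secA,
                                        String primB, String secB) {
        return tailsMatch(primA, primB)
            || tailsMatch(secA, primB)
            || tailsMatch(primA, secB)
            || tailsMatch(secA, secB);
    }
}
```

For example, `tailsMatch("KT", "HT")` compares the tails `T` and `T` and returns true, while `tailsMatch("KT", "TK")` returns false. Because Metaphone drops most vowels, expect this to over-match.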

I think that would be a good starting point.

A note on this and many other phonetic algorithms: they aren't designed to provide a precise phonetic definition. Varied geographic pronunciation, common mispronunciations and alternate pronunciations make a hard and fast single correct pronunciation impossible to obtain based solely on the word. Novel spelling and letter usage make it hard to algorithmically obtain a close pronunciation (care for some hors d'oeuvres?). Also, a major goal of many such algorithms is to match similar-sounding or misheard words or names to each other, so the results are usually intended to be a bit imprecise (this is probably a good thing for this purpose as well).

femtoRgon
4

I wrote a rhyming dictionary program on my blog. The idea is to use a dictionary with pronunciations and compare phonemes starting from the end; two words with the same ending phonemes are rhymes for each other.
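This is not the code from the blog, just a rough sketch of the comparison described above. It assumes each word's pronunciation is available as a list of phoneme strings; the class name and the `minShared` threshold are hypothetical:

```java
import java.util.List;

class PhonemeRhyme {
    // Counts how many phonemes two pronunciations share, walking backwards from the end.
    static int sharedEnding(List<String> a, List<String> b) {
        int i = a.size() - 1;
        int j = b.size() - 1;
        int shared = 0;
        while (i >= 0 && j >= 0 && a.get(i).equals(b.get(j))) {
            i--;
            j--;
            shared++;
        }
        return shared;
    }

    // Treats two words as rhymes if at least minShared ending phonemes agree.
    // The threshold is an assumption; tune it to taste.
    static boolean rhymes(List<String> a, List<String> b, int minShared) {
        return sharedEnding(a, b) >= minShared;
    }
}
```

With illustrative phoneme lists `[K, AE, T]` and `[HH, AE, T]`, `sharedEnding` returns 2, so the two words rhyme for `minShared = 2`. Note that a word would also "rhyme" with itself under this check, so you may want to exclude identical pronunciations.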

user448810
0

You may want to take a look at the Carnegie Mellon pronouncing dictionary, for starters. It's the best pronouncing dictionary I've been able to find.

http://www.speech.cs.cmu.edu/cgi-bin/cmudict
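If you download the dictionary as a text file, reading it could look something like the sketch below. It assumes the usual plain-text layout: one entry per line, the word first, then whitespace-separated phonemes (vowels carrying a stress digit), comment lines starting with `;;;`, and alternate pronunciations marked with a suffix like `(1)`. Check the file you actually download, since the layout may differ between releases. The class name `CmuDictLine` is hypothetical:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CmuDictLine {
    final String word;
    final List<String> phonemes;

    CmuDictLine(String word, List<String> phonemes) {
        this.word = word;
        this.phonemes = phonemes;
    }

    // Parses one dictionary line; returns null for blank lines, comments,
    // or lines without any phonemes.
    static CmuDictLine parse(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith(";;;")) {
            return null;
        }
        String[] parts = trimmed.split("\\s+");
        if (parts.length < 2) {
            return null;
        }
        // Strip an alternate-pronunciation marker such as "(1)" from the word.
        String word = parts[0].replaceAll("\\(\\d+\\)$", "");
        List<String> phonemes =
            new ArrayList<>(Arrays.asList(parts).subList(1, parts.length));
        return new CmuDictLine(word, phonemes);
    }
}
```

Calling `CmuDictLine.parse` on every line of the file and storing the results in a map from word to phoneme lists gives you the raw data for an end-of-word phoneme comparison.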

charliegreen