Detecting whether two words rhyme phonetically using Python

Question

First of all, I am extremely new to Python (and programming in general).

I am looking to write a program to detect whether two specific syllables (strings) rhyme phonetically. I have already tried the module "pronouncing", but it generally only checks for perfect rhymes (e.g. "cat" and "hat"). However, rhymes based on phonetics such as "poor" and "pour" or "poor" and "tour" aren't detected.

I have written the following program, which takes a series of words as input, splits them into syllables, uses those syllables as both the rows and the columns of an array, and cross-checks every pair of syllables against the module "pronouncing". An entry is 1 if the two syllables rhyme and 0 if not.

So for instance, the input

cat hat man

will output the array

1 1 0
1 1 0
0 0 1

Here's the code as of now:

from hyphen import Hyphenator
import pronouncing
import numpy as np

h_en = Hyphenator('en_US')

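# Read whitespace-separated words from standard input.
words = input().split()

# Split every word into syllables; syllables() returns an empty list
# when it finds no split point, so keep such words whole.
syllables = []
for word in words:
    parts = h_en.syllables(word)
    syllables += parts if parts else [word]

print(syllables)

# Flat list of flags: 1 if the second syllable is listed as a rhyme of
# the first (or is the same syllable), 0 otherwise.
flags = []
for first in syllables:
    candidates = pronouncing.rhymes(first) + [first]
    for second in syllables:
        flags.append(1 if second in candidates else 0)
print(flags)

# Cut the flat list into rows of len(syllables) entries and transpose.
n = len(syllables)
a = np.transpose([flags[x:x + n] for x in range(0, len(flags), n)])
print()
print(a)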

The reason for outputting an array is irrelevant in this context, as I am only looking to improve the part of the program that checks for rhymes.

How would you go about this problem? How would you write a program that can detect phonetic rhymes without too many false positives?

This is a pretty broad problem, not exactly the sort of thing that I'd be trying to tackle as a new programmer. That said, the sort of information you are after can be found by searching for 'natural language processing'. Python's NLTK library might also be of interest to you. — Matt, Oct 10 '17 at 09:27
Do you have a concept for an algorithm that might be able to achieve the goals of the program, perhaps written in pseudo-code? — Mikael Moesgaard, Oct 10 '17 at 09:55
Like I said above, this is not an easy problem. How exactly would you describe a rhyme? It's based on a word's pronunciation, not its spelling, so you need a way of getting a phonetic representation of a word from its spelling. Since the English language has so many idiosyncrasies, this is quite difficult. One approach is to just make a dictionary of words with their pronunciation (I believe `pronouncing` does this), but then what happens if the word isn't in your dictionary? Without simplifying the problem, there is no simple solution here (without using 3rd party code/libraries) — Matt, Oct 10 '17 at 10:08
You're absolutely right, it's not going to be a simple program. I will probably use a library; do you estimate that NLTK will be able to achieve the goal of the program? And if so, how? — Mikael Moesgaard, Oct 10 '17 at 10:52
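
As a rough illustration of the dictionary approach mentioned in the comments, here is a minimal sketch. Everything in it is an assumption made for the example: the tiny `PRONUNCIATIONS` table is hand-entered in ARPAbet-style notation (a real program would load pronunciations from a proper pronouncing dictionary), the `SIMILAR_VOWELS` grouping is a guess that would need tuning, and the function names are made up. It compares the sounds from the last stressed vowel to the end of each word, and optionally treats some vowels as close enough to count as a loose rhyme.

```python
# Hand-entered ARPAbet-style pronunciations, for illustration only.
# A real program would load these from a pronouncing dictionary instead.
PRONUNCIATIONS = {
    "cat": ["K", "AE1", "T"],
    "hat": ["HH", "AE1", "T"],
    "man": ["M", "AE1", "N"],
    "poor": ["P", "UH1", "R"],
    "pour": ["P", "AO1", "R"],
    "tour": ["T", "UH1", "R"],
}

# Assumption: vowels in the same group count as "close enough" for a
# loose rhyme. The grouping is a guess and should be tuned.
SIMILAR_VOWELS = [{"UH", "AO", "UW"}]


def rhyming_part(phones):
    """Return the phones from the last stressed vowel to the end."""
    for index in range(len(phones) - 1, -1, -1):
        # Vowels carry a stress digit; 1 and 2 mark stressed vowels.
        if phones[index][-1] in "12":
            return phones[index:]
    return phones  # no stressed vowel found: compare the whole word


def same_sound(a, b, loose):
    """Compare two phones, ignoring stress marks."""
    a, b = a.rstrip("012"), b.rstrip("012")
    if a == b:
        return True
    return loose and any(a in group and b in group for group in SIMILAR_VOWELS)


def rhymes(word_a, word_b, loose=False):
    """Return True if both words are known and their rhyming parts match."""
    if word_a not in PRONUNCIATIONS or word_b not in PRONUNCIATIONS:
        return False  # unknown word: this sketch simply gives up
    part_a = rhyming_part(PRONUNCIATIONS[word_a])
    part_b = rhyming_part(PRONUNCIATIONS[word_b])
    if len(part_a) != len(part_b):
        return False
    return all(same_sound(a, b, loose) for a, b in zip(part_a, part_b))


print(rhymes("cat", "hat"))
print(rhymes("poor", "pour"))
print(rhymes("poor", "pour", loose=True))
```

The strict mode only accepts identical rhyming parts, while `loose=True` widens the match through the vowel groups, so the size of those groups is what controls the number of false positives. Words missing from the table are reported as not rhyming, which leaves the "word isn't in your dictionary" problem open.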
