First of all, I am extremely new to Python (and to programming in general).
I am looking to write a program to detect whether two specific syllables (strings) rhyme phonetically. I have already tried the module "pronouncing", but it generally only checks for perfect rhymes (e.g. "cat" and "hat"). However, rhymes based on phonetics such as "poor" and "pour" or "poor" and "tour" aren't detected.
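To make the distinction concrete, this is my understanding of a perfect-rhyme check, as a minimal sketch: two words rhyme if their phonemes match from the last stressed vowel onwards. The ARPAbet-style transcriptions in `PHONES` are hand-written for illustration (an assumption on my part, not output copied from "pronouncing"), and `rhyming_tail` and `is_perfect_rhyme` are hypothetical helper names.

```python
# Hand-written ARPAbet-style transcriptions, for illustration only (assumed,
# not looked up in a real pronunciation dictionary).
PHONES = {
    "cat": "K AE1 T",
    "hat": "HH AE1 T",
    "man": "M AE1 N",
    "poor": "P UH1 R",
    "pour": "P AO1 R",
}


def rhyming_tail(phones):
    """Return the phonemes from the last stressed vowel to the end."""
    parts = phones.split()
    for i in range(len(parts) - 1, -1, -1):
        # Stressed vowels end in 1 (primary) or 2 (secondary).
        if parts[i][-1] in "12":
            return parts[i:]
    return parts  # no stressed vowel found: compare everything


def is_perfect_rhyme(a, b):
    """True if both words share exactly the same rhyming tail."""
    return rhyming_tail(PHONES[a]) == rhyming_tail(PHONES[b])


print(is_perfect_rhyme("cat", "hat"))    # True
print(is_perfect_rhyme("cat", "man"))    # False
print(is_perfect_rhyme("poor", "pour"))  # False: UH1 and AO1 differ
```

With a strict comparison like this, a single differing vowel is enough to reject the pair, which is the behaviour I would like to relax.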
I have written the following program, which takes a series of words as input, splits them into syllables, uses these syllables as both the rows and the columns of an array, and "cross-checks" each pair of syllables against the module "pronouncing": an entry is 1 if the two syllables rhyme (or are identical) and 0 if not.
So for instance, the input
cat hat man
will output the array
1 1 0
1 1 0
0 0 1
Here's the code as of now:
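from hyphen import Hyphenator
import pronouncing
import numpy as np

h_en = Hyphenator('en_US')

# Read a line of space-separated words.
P = input().split()

# Split each word into syllables; keep the whole word if the
# hyphenator returns no syllables for it.
Q = []
for word in P:
    syllables = h_en.syllables(word)
    Q += syllables if syllables else [word]
print(Q)

# Flat list of 0/1 flags: 1 if Q[j] is Q[i] itself or is listed as a rhyme of Q[i].
S = []
for i in range(len(Q)):
    rhymes = pronouncing.rhymes(Q[i]) + [Q[i]]  # look up once per syllable
    for j in range(len(Q)):
        S.append(1 if Q[j] in rhymes else 0)
print(S)

# Cut the flat list into rows of len(Q) entries and transpose.
a = np.transpose([S[x:x+len(Q)] for x in range(0, len(S), len(Q))])
print()
print(a)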
The reason for outputting an array is irrelevant in this context, as I am only looking to optimise the part of the program that checks for rhymes.
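To show which part I mean, here is a minimal sketch that keeps the array building separate from the rhyme test, so that only the test would need replacing. The names `rhyme_matrix` and `same_last_two_letters` are hypothetical, and the spelling-based test is just a placeholder standing in for a real phonetic check.

```python
def rhyme_matrix(syllables, rhymes_with):
    """Return a square 0/1 matrix; entry [i][j] is 1 if syllables i and j rhyme."""
    return [
        [1 if a == b or rhymes_with(a, b) else 0 for b in syllables]
        for a in syllables
    ]


def same_last_two_letters(a, b):
    # Placeholder test based on spelling only; a phonetic check would go here.
    return a[-2:] == b[-2:]


for row in rhyme_matrix(["cat", "hat", "man"], same_last_two_letters):
    print(*row)
```

For the input cat hat man this prints the same 1 1 0 / 1 1 0 / 0 0 1 pattern as the example output shown earlier.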
How would you go about this problem? How would you write a program that can detect phonetic rhymes without too many false positives?
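One direction I have been toying with, as a rough sketch only: compare the rhyming parts phoneme by phoneme, and treat vowels in the same hand-picked group as a match. The groups in `VOWEL_GROUPS`, the ARPAbet-style transcriptions and the function names are all my own assumptions, not something taken from "pronouncing".

```python
# Hand-picked groups of ARPAbet vowels treated as "close enough" (an assumption,
# chosen only for this example; a real grouping would need tuning).
VOWEL_GROUPS = [{"UH", "AO", "OW", "UW"}, {"IH", "IY"}, {"AE", "EH"}]


def strip_stress(phone):
    """Remove a trailing stress digit, e.g. 'UH1' -> 'UH'."""
    return phone.rstrip("012")


def similar(p, q):
    """True if two phonemes are equal or fall in the same vowel group."""
    p, q = strip_stress(p), strip_stress(q)
    return p == q or any(p in g and q in g for g in VOWEL_GROUPS)


def is_near_rhyme(tail_a, tail_b):
    """Compare two rhyming parts (lists of phonemes) position by position."""
    return len(tail_a) == len(tail_b) and all(
        similar(p, q) for p, q in zip(tail_a, tail_b)
    )


# Hand-written transcriptions of the rhyming parts (assumed for illustration).
print(is_near_rhyme(["UH1", "R"], ["AO1", "R"]))  # poor / pour -> True
print(is_near_rhyme(["AE1", "T"], ["AE1", "N"]))  # cat / man   -> False
```

My worry is that the larger the groups get, the more false positives this would produce, and I don't know how to choose them sensibly.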