I have a text file of common misspellings and their corrections.
All misspellings of the same intended word should be on the same line.
This is partly done already, but not for every misspelling of the same word.
misspellings_corpus.txt (snippet):
I'de->I'd
aple->apple
appl->apple
I'ed, I'ld, Id->I'd
Desired:
I'de, I'ed, I'ld, Id->I'd
aple, appl->apple
Template: wrong1, wrong2, wrongN->correct
Attempt:
lines = []
with open('/content/drive/MyDrive/Colab Notebooks/misspellings_corpus.txt', 'r') as fin:
    lines = fin.readlines()
for this_idx, this_line in enumerate(lines):
    for comparison_idx, comparison_line in enumerate(lines):
        if this_idx != comparison_idx:
            if this_line.split('->')[1].strip() == comparison_line.split('->')[1].strip():
                #...
correct_words = [l.split('->')[1].strip() for l in lines]
correct_words
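
For reference, the grouping described under "Desired" could be sketched roughly as follows. This is only a minimal sketch, not a tested solution for the full corpus: it assumes every useful line contains a '->' separator, that misspellings on the left are comma-separated, and that the order of first appearance should be kept. The helper name merge_misspellings and the sample list are hypothetical and stand in for the lines read from the file.

```python
from collections import defaultdict


def merge_misspellings(lines):
    """Group all misspellings that share a correction onto one line.

    Hypothetical helper; assumes the 'wrong1, wrong2->correct' layout.
    """
    grouped = defaultdict(list)  # correction -> misspellings, in first-seen order
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        wrong_part, sep, correct = line.rpartition('->')
        if not sep:
            continue  # assumption: lines without '->' are ignored
        correct = correct.strip()
        for wrong in wrong_part.split(','):
            wrong = wrong.strip()
            # Keep each misspelling once per correction.
            if wrong and wrong not in grouped[correct]:
                grouped[correct].append(wrong)
    # Dicts preserve insertion order, so corrections appear in first-seen order.
    return [
        f"{', '.join(wrongs)}->{correct}"
        for correct, wrongs in grouped.items()
        if wrongs
    ]


# Sample taken from the snippet above.
sample = ["I'de->I'd", "aple->apple", "appl->apple", "I'ed, I'ld, Id->I'd"]
merged = merge_misspellings(sample)
print("\n".join(merged))
```

Keying a dictionary on the correction avoids the nested loop over every pair of lines, so each line is only visited once.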