I want to auto-correct the words that are in my list.
Say I have a list:
kw = ['tiger','lion','elephant','black cat','dog']
I want to check whether these words appear in my sentences. If they are misspelled, I want to correct them. I don't intend to touch any words other than those in the given list.
Now I have a list of strings:
s = ["I saw a tyger","There are 2 lyons","I mispelled Kat","bulldogs"]
Expected output:
['tiger','lion',None,'dog']
My Efforts:
import difflib
op = [difflib.get_close_matches(i,kw,cutoff=0.5) for i in s]
print(op)
My Output:
[[], [], [], ['dog']]
The problem with the above code is that it compares each entire sentence against the keywords, and my kw list can contain entries of more than one word (up to 4-5 words).
If I lower the cutoff value, it starts returning words that it should not.
I could create bigrams and trigrams from each sentence, but that would consume a lot of time.
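For reference, the n-gram idea would look roughly like the sketch below. The helper names `ngrams` and `correct` and the 0.6 cutoff are placeholders I made up for illustration; only `difflib.SequenceMatcher` comes from the standard library. It scores every n-gram of a sentence against every keyword, which is exactly the brute-force cost I would like to avoid.

```python
import difflib

kw = ['tiger', 'lion', 'elephant', 'black cat', 'dog']
s = ["I saw a tyger", "There are 2 lyons", "I mispelled Kat", "bulldogs"]


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def correct(sentence, keywords, cutoff=0.6):
    """Return the keyword closest to any n-gram of the sentence, or None.

    The cutoff is an assumed value; it has to be tuned by hand.
    """
    tokens = sentence.lower().split()
    # Only build n-grams as long as the longest keyword (in words).
    max_n = max(len(k.split()) for k in keywords)
    best, best_score = None, 0.0
    for n in range(1, max_n + 1):
        for gram in ngrams(tokens, n):
            for k in keywords:
                score = difflib.SequenceMatcher(None, gram, k).ratio()
                if score > best_score:
                    best, best_score = k, score
    return best if best_score >= cutoff else None


print([correct(sentence, kw) for sentence in s])
```

With a cutoff of 0.6 this gives `['tiger', 'lion', None, None]`: "bulldogs" against "dog" scores about 0.55, while "kat" against "black cat" scores 0.5, so any cutoff that accepts the former is only barely rejecting the latter.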
So is there a way to implement this?
I have explored a few more libraries, like autocorrect, hunspell, etc., but with no success.