Is it possible to exclude certain words from the dictionary when using PyEnchant? For example, I want to check whether a word is English ('en_EN' in my case) or French ('fr_FR'). However, when I check the string "de" against both dictionaries, both return True.
1 Answer
You could try removing stop words before passing the tokens to PyEnchant:

    from nltk.corpus import stopwords

    def remove_stop_words(tokenized_docs_no_punctuation):
        """
        :param tokenized_docs_no_punctuation: list of word tokens, punctuation already removed
        :return: list of the tokens that are not English stop words
        """
        # Build the stop-word set once; needs nltk.download('stopwords') beforehand
        english_stop_words = set(stopwords.words('english'))
        tokenized_docs_no_stopwords = []
        for token in tokenized_docs_no_punctuation:
            if token not in english_stop_words:
                tokenized_docs_no_stopwords.append(token)
        return tokenized_docs_no_stopwords

Then pass the remaining tokens to PyEnchant.
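
To make the "filter first, then check" idea concrete for the "de" case, here is a minimal sketch with a per-language exclusion list. The names `matching_languages` and `WordListDict` are hypothetical and not part of PyEnchant or NLTK; `WordListDict` is only a stand-in so the example runs without installed dictionaries. The one thing assumed about a real dictionary object is that it has a `check(word)` method, which `enchant.Dict` provides.

```python
def matching_languages(word, dictionaries, excluded=None):
    """Return the language tags whose dictionary accepts `word`.

    dictionaries: mapping of tag -> object with a check(word) method.
    excluded: optional mapping of tag -> set of lowercase words to ignore
              for that language (hypothetical helper structure).
    """
    excluded = excluded or {}
    matches = []
    for tag, dictionary in dictionaries.items():
        # Skip the dictionary lookup if the word is excluded for this language
        if word.lower() in excluded.get(tag, ()):
            continue
        if dictionary.check(word):
            matches.append(tag)
    return matches


class WordListDict:
    """Stand-in for a spell-check dictionary; used only to keep the sketch runnable."""

    def __init__(self, words):
        self._words = set(words)

    def check(self, word):
        return word in self._words


# Both toy dictionaries accept "de", mirroring the situation in the question
dictionaries = {
    "en": WordListDict({"de", "house"}),
    "fr": WordListDict({"de", "maison"}),
}
excluded = {"en": {"de"}}  # treat "de" as not English

print(matching_languages("de", dictionaries, excluded))  # ['fr']
```

With PyEnchant installed, the values of `dictionaries` could be real `enchant.Dict` objects (for example `enchant.Dict("fr_FR")`), and the exclusion sets could be built by hand or seeded from a stop-word list such as `stopwords.words('french')`, which I would expect to contain "de".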

gogasca