The code below should loop through the text column of a tweet dataset and, for each word that is not in the stop-word list, correct its spelling, lemmatize it, and then stem it. It is not working properly. Can you help me fix it? The error is shown in the attached image.

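import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer
from spellchecker import SpellChecker

pstem = PorterStemmer()
lem = WordNetLemmatizer()
spell = SpellChecker()
stop_words = set(stopwords.words('english'))  # a set makes membership tests fast

for i in df.index:  # iterate over the actual index labels, not 0..len-1
    text = df.at[i, 'text']
    tokens = nltk.word_tokenize(text)
    # the NLTK stop-word list is lowercase, so compare lowercased tokens
    tokens = [word for word in tokens if word.lower() not in stop_words]
    for j in range(len(tokens)):
        # correction() may return None when it finds no candidate; keep the word then
        tokens[j] = spell.correction(tokens[j]) or tokens[j]
        tokens[j] = lem.lemmatize(tokens[j])
        tokens[j] = pstem.stem(tokens[j])
    tokens_sent = ' '.join(tokens)
    df.at[i, 'text'] = tokens_sent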
Ghadah
  • In the code above the indentation is messed up, but I'm guessing that is an artifact of copying the code here rather than a problem in the real code (your 2nd `for` needs to be indented one more level and your last 2 lines need to be indented one less). What problem are you seeing? Please post the exact error message or problem behavior. – bivouac0 Nov 28 '19 at 23:13
  • Thank you for your reply. I fixed the indentation in the question and attached the image of the error I got after interrupting the code – Ghadah Nov 30 '19 at 10:38
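
The error image is not reproduced here, so this is only a guess: since the traceback appeared after interrupting the code, the loop may simply be slow rather than wrong, because the spelling correction runs again for every token of every tweet. A minimal sketch of one way around that is to process each distinct word only once and reuse the result. The function name `normalize_texts` and the stand-in steps `fake_correct` and `fake_stem` are hypothetical, made up for illustration so the example runs without NLTK or pyspellchecker installed.

```python
from typing import Callable, Dict, Iterable, List, Set


def normalize_texts(
    texts: Iterable[str],
    stop_words: Set[str],
    tokenize: Callable[[str], List[str]],
    steps: List[Callable[[str], object]],
) -> List[str]:
    """Apply `steps` in order to each non-stop-word token, once per distinct token."""
    cache: Dict[str, str] = {}

    def process(word: str) -> str:
        if word not in cache:
            result = word
            for step in steps:
                # Keep the previous value if a step returns None or ''.
                result = step(result) or result
            cache[word] = result
        return cache[word]

    cleaned = []
    for text in texts:
        tokens = [t for t in tokenize(text) if t.lower() not in stop_words]
        cleaned.append(" ".join(process(t) for t in tokens))
    return cleaned


# Hypothetical stand-ins for the real spell checker and stemmer.
calls = []


def fake_correct(word):
    calls.append(word)  # record how often the "expensive" step runs
    return {"runing": "running"}.get(word)  # None when no correction is known


def fake_stem(word):
    return word[:-1] if word.endswith("s") else word


demo = normalize_texts(
    ["the cats are runing", "two cats runing fast"],
    stop_words={"the", "are"},
    tokenize=str.split,
    steps=[fake_correct, fake_stem],
)
print(demo)
print(calls)
```

With the objects from the question, the call would presumably look like `df['text'] = normalize_texts(df['text'], set(stopwords.words('english')), nltk.word_tokenize, [spell.correction, lem.lemmatize, pstem.stem])`. Whether this helps depends on how many words repeat across tweets, which the question does not say.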

0 Answers