    import numpy as np
    import pandas as pd
    df=pd.read_excel('Finning2.xlsx',encoding='utf-8')
    import nltk
    nltk.download('vader_lexicon')
    from nltk.sentiment.vader import SentimentIntensityAnalyzer
    sid = SentimentIntensityAnalyzer()

    review = df['review']
    review = str(review).encode('utf-8')

    df['scores'] = df['review'].apply(lambda review:sid.polarity_scores(review))
Vadim Kotov
Mohamed Helmy

3 Answers


Convert each value in the review column to a string before applying the polarity_scores function, so that non-string cells (for example numbers or empty cells) do not break the call:

    df['score'] = df['review'].apply(lambda review:sid.polarity_scores(str(review)))
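A slightly more defensive variant of the same idea, as a sketch only. It assumes that missing cells should be scored as empty text rather than as a literal "nan" or "None". The helper name `add_sentiment_scores` and its `scorer`, `column`, and `target` parameters are made up for illustration; `scorer` can be any callable that takes a string, such as `sid.polarity_scores`.

```python
import pandas as pd


def add_sentiment_scores(df, scorer, column="review", target="scores"):
    """Return a copy of df with `scorer` applied to each value of `column` as text.

    Hypothetical helper: `scorer` is any callable that accepts a str,
    e.g. SentimentIntensityAnalyzer().polarity_scores.
    """
    out = df.copy()
    # Convert element by element: missing values become "", everything else str(value).
    text = out[column].map(lambda value: "" if pd.isna(value) else str(value))
    out[target] = text.apply(scorer)
    return out
```

With the analyzer from the question this would be called as `df = add_sentiment_scores(df, sid.polarity_scores)`.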
msrr

Try this (worked for me):

    import pandas as pd
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    # Fetch the VADER lexicon that the analyzer relies on
    nltk.download('vader_lexicon')
    sid = SentimentIntensityAnalyzer()

    # Read the sheet and cast every column to str so polarity_scores gets text
    df = pd.read_excel('Finning2.xlsx').astype(str)

    # Score each review individually
    df['scores'] = df['review'].apply(lambda review: sid.polarity_scores(review))

I mocked up an example (shown below) but am not able to replicate the behavior you're seeing. Can you please show us how the dataframe is being formed or a sample of what the 'review' column looks like for your data?

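    import pandas as pd

    # Sample data; named `data` to avoid shadowing the built-in `dict`
    data = {"area": [8.516, 17.10, 3.286, 9.597, 1.221]}
    df = pd.DataFrame(data)

    # str() on a Series returns its printed representation as one string
    area = str(df['area']).encode("utf-8")
    print(area)

And here is the output:

    b'0     8.516\n1    17.100\n2     3.286\n3     9.597\n4     1.221\nName: area, dtype: float64'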
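Note that the result is a single bytes object for the whole column. That hints at why the `str(review).encode('utf-8')` lines in the question have no effect on what `apply` later receives: `str()` on a Series gives its printed representation, whereas `astype(str)` converts element by element. A small sketch with made-up data (the `reviews` Series is an assumption, not the asker's data):

```python
import pandas as pd

# Made-up sample: a column mixing text and numbers
reviews = pd.Series(["great service", 42, 3.5], name="review")

# str() on the whole Series returns its printed representation as ONE string
as_one_string = str(reviews)

# astype(str) converts each element and returns a Series of the same length
as_strings = reviews.astype(str)

print(type(as_one_string).__name__)
print(as_strings.tolist())
```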
jkovba