Output VADER sentiment scores in columns based on dataframe rows of tweets

Question

I have a dataframe that contains rows of tweets, and I would like to create 4 columns of scores ('positive', 'negative', 'neutral' and 'compound') based on the content of each row, using VADER sentiment analysis.

I looked at several other posts but couldn't work it out for my exact case. Thank you in advance!

score 11 · Accepted Answer · answered May 05 '20 at 08:50

I actually found a simple solution to do it through list comprehensions for anyone facing the same problem:

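from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
# or: from nltk.sentiment.vader import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
# Each line scores every tweet again and keeps one key of the returned dict
df['compound'] = [analyzer.polarity_scores(x)['compound'] for x in df['tweet']]
df['neg'] = [analyzer.polarity_scores(x)['neg'] for x in df['tweet']]
df['neu'] = [analyzer.polarity_scores(x)['neu'] for x in df['tweet']]
df['pos'] = [analyzer.polarity_scores(x)['pos'] for x in df['tweet']]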

Specter07

Thank you for your easy to use solution. – purplecollar Aug 12 '20 at 01:36
Thanks, exactly what I was looking for. – Andrej Oct 10 '20 at 15:13
This is simple and it works, but it means running the analyzer 4 times, getting back complete answers each time. Any way to take the dict that is returned and assign it to a set of new columns? – ViennaMike Sep 22 '22 at 01:38
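One way to address the point in the last comment, as a rough sketch rather than a definitive answer: score each tweet once, then expand the returned dicts into columns. The helper name `add_sentiment_columns` and its `score_fn` parameter are made up for illustration; `score_fn` is assumed to be any callable that returns a dict per text, such as `analyzer.polarity_scores`.

```python
import pandas as pd


def add_sentiment_columns(df, text_col, score_fn):
    """Return a copy of df with one new column per key of the score dict.

    score_fn is called exactly once per row (hypothetical parameter; pass
    e.g. SentimentIntensityAnalyzer().polarity_scores).
    """
    # One scoring call per row; the result is a Series of dicts
    scores = df[text_col].map(score_fn)
    # Turn the list of dicts into a frame, reusing df's index so rows line up
    score_df = pd.DataFrame(scores.tolist(), index=df.index)
    # join returns a new frame; the original df is left unchanged
    return df.join(score_df)
```

With the real analyzer this would be called as `df = add_sentiment_columns(df, 'tweet', SentimentIntensityAnalyzer().polarity_scores)`, where the `tweet` column name follows the accepted answer.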

score 2 · Answer 2 · answered May 05 '20 at 07:53

Something like this should work:

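import pandas as pd
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
analyzer = SentimentIntensityAnalyzer()
df['rating'] = df['tweets'].apply(analyzer.polarity_scores)  # one dict per tweet
# Expand each dict into neg/neu/pos/compound columns; concat returns a new frame, so assign it back
df = pd.concat([df.drop(['rating'], axis=1), df['rating'].apply(pd.Series)], axis=1)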

luigigi

score 2 · Answer 3 · answered May 08 '20 at 16:44

I have done similar work using VADER for sentiment analysis in Python 3. Take a look; it may show you how to do what you need.

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

# Count how many lines of the positive file VADER also rates as positive
pos_count = 0
pos_correct = 0

with open("D:/Corona_Vac/pythonprogramnet/Positive BOW.txt","r") as f:
    for line in f.read().split('\n'):
        vs = analyzer.polarity_scores(line)
        if vs['neg'] <= 0.1:  # only count lines without a noticeable negative score
            if vs['pos']-vs['neg'] > 0:
                pos_correct += 1
            pos_count +=1


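# Same check in the other direction, intended for negative samples
neg_count = 0
neg_correct = 0

# NOTE: this opens the same "Positive BOW.txt" as above, which looks like a copy-paste slip;
# point it at the file of negative samples for the negative accuracy to be meaningful
with open("D:/Corona_Vac/pythonprogramnet/Positive BOW.txt","r") as f:
    for line in f.read().split('\n'):
        vs = analyzer.polarity_scores(line)
        if vs['pos'] <= 0.1:  # only count lines without a noticeable positive score
            if vs['pos']-vs['neg'] <= 0:
                neg_correct += 1
            neg_count +=1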

print("Positive accuracy = {}% via {} samples".format(pos_correct/pos_count*100.0, pos_count))
print("Negative accuracy = {}% via {} samples".format(neg_correct/neg_count*100.0, neg_count))

Hope this helps. Thanks
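The two loops in this answer differ only in which score is filtered on and which comparison counts as correct, so they could probably be folded into one helper. The following is only a sketch under that reading: `directional_accuracy`, `score_fn`, `expect_positive` and `max_opposite` are illustrative names, not part of VADER, and `score_fn` is assumed to return a dict with 'pos' and 'neg' keys like `analyzer.polarity_scores` does.

```python
def directional_accuracy(lines, score_fn, expect_positive, max_opposite=0.1):
    """Return (accuracy_percent, sample_count) for lines of one known polarity.

    Lines whose opposite-polarity score exceeds max_opposite are skipped,
    matching the 0.1 cut-off used in the loops above.
    Returns (None, 0) when no line is counted, to avoid dividing by zero.
    """
    counted = 0
    correct = 0
    for line in lines:
        vs = score_fn(line)
        opposite = vs["neg"] if expect_positive else vs["pos"]
        if opposite > max_opposite:
            continue  # not counted, same as the original filter
        counted += 1
        diff = vs["pos"] - vs["neg"]
        # Positive samples need pos > neg; negative samples need pos <= neg
        if (diff > 0) if expect_positive else (diff <= 0):
            correct += 1
    if counted == 0:
        return None, 0
    return correct / counted * 100.0, counted
```

It would be called once per file, for example `directional_accuracy(f.read().split('\n'), analyzer.polarity_scores, expect_positive=True)` inside the `with open(...)` block.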
