Split column values to several in pandas dataframe

Question

I am trying to do sentiment analysis on tweets using SentimentIntensityAnalyzer from nltk.sentiment.vader:

import pandas as pd
from nltk.sentiment.vader import SentimentIntensityAnalyzer

sid = SentimentIntensityAnalyzer()

listy = []

for index, row in data.iterrows():
  ss = sid.polarity_scores(row["Tweets"])
  listy.append(ss)

se = pd.Series(listy)
data['polarity'] = se.values

display(data.head(100))

This is the resulting DataFrame:

    Tweets  polarity
0   RT @spectatorindex: Facebook controls:\n\n- Wh...   {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound...
1   RT @YAATeamWest: Today we're at @BradfordUniSU...   {'neg': 0.0, 'neu': 0.902, 'pos': 0.098, 'comp...
2   #SachinTendulkar launches India’s first Multip...   {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound...
3   How To Create a 360 Render (And How to Improv...   {'neg': 0.0, 'neu': 0.722, 'pos': 0.278, 'comp...
4   The Most Disturbing Virtual Reality You Will E...   {'neg': 0.174, 'neu': 0.826, 'pos': 0.0, 'comp...
5   VR Training for Troops \n\n...    {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound...
6   RT @DefenceHQ: The @BritishArmy has awarded a ...   {'neg': 0.0, 'neu': 0.847, 'pos': 0.153, 'comp...
7   RT @UofGHumanities: @UofGCSPE Humanities Lectu...   {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound...
8   RT @OyezServices: Ever wanted a tour of Machu ...   {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound...
9   RT @ProjectDastaan: We are an Oxford Universit...   {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound...
10  RT @Paula_Piccard: Virtual reality will change...   {'neg': 0.0, 'neu': 0.878, 'pos': 0.122, 'comp...

In order to do statistical analysis on the 'neg', 'pos', 'neu' and 'compound' values in the polarity column, I wanted to split them into four separate columns. To achieve this I used:

list_pos = []
list_neg = []
list_comp = []
list_neu = []
for index, row in data.iterrows():
  list_pos.append(row['polarity']['pos'])
  list_neg.append(row['polarity']['neg'])
  list_comp.append(row['polarity']['compound'])
  list_neu.append(row['polarity']['neu'])
se_pos = pd.Series(list_pos)
se_neg = pd.Series(list_neg)
se_comp = pd.Series(list_comp)
se_neu = pd.Series(list_neu)
data['positive'] = se_pos.values
data['negative'] = se_neg.values
data['compound'] = se_comp.values
data['neutral'] = se_neu.values

The resulting DataFrame:

    Tweets  polarity    positive    negative    compound    neutral
0   RT @spectatorindex: Facebook controls:\n\n- Wh...   {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound...   0.000   0.000   0.0000  1.000
1   RT @YAATeamWest: Today we're at @BradfordUniSU...   {'neg': 0.0, 'neu': 0.902, 'pos': 0.098, 'comp...   0.098   0.000   0.3612  0.902
2   #SachinTendulkar launches India’s first Multip...   {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound...   0.000   0.000   0.0000  1.000

Is there a more concise way of achieving a similar DataFrame? Using a lambda function, perhaps? Thanks for the help!

How are you storing a raw dict in a dataframe to begin with? I tried to reproduce and couldn't, and [this](https://github.com/pandas-dev/pandas/issues/17777) claims it isn't possible — Josh Friedlander, Feb 04 '19 at 16:27
@JoshFriedlander I just changed the list to a pandas Series object and then passed its values into the dataFrame. Here is the google Colab [link](https://colab.research.google.com/drive/1qwTdQHnAriMkA8NeYksIz21EX7lEJUsV) — Yashwardhan kaul, Feb 04 '19 at 17:29
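
One possible shorter route to the four columns asked about above, offered as a minimal sketch rather than a definitive answer: build a second DataFrame from the list of dicts and join it back. The tweets and scores below are made-up stand-ins for the output of sid.polarity_scores, so the snippet runs without NLTK or the VADER lexicon; the name `scores` is hypothetical.

```python
import pandas as pd

# Stand-in data: each dict mimics what sid.polarity_scores(...) returns.
# The tweets and numbers are illustrative, not real results.
data = pd.DataFrame({
    "Tweets": ["first tweet", "second tweet", "third tweet"],
    "polarity": [
        {"neg": 0.0, "neu": 1.0, "pos": 0.0, "compound": 0.0},
        {"neg": 0.0, "neu": 0.902, "pos": 0.098, "compound": 0.3612},
        {"neg": 0.174, "neu": 0.826, "pos": 0.0, "compound": -0.4},
    ],
})

# Turn the column of dicts into one column per key. Passing index=
# keeps the rows aligned even if data does not use the default index.
scores = pd.DataFrame(data["polarity"].tolist(), index=data.index)

# Rename to the column names used in the question, then attach them.
scores = scores.rename(
    columns={"pos": "positive", "neg": "negative", "neu": "neutral"}
)
data = data.join(scores)

print(data.drop(columns="polarity"))
```

Under the same assumptions, the first loop could likely be replaced by data["Tweets"].apply(sid.polarity_scores), which yields a Series of dicts without iterrows. Building the frame from a list of dicts avoids a per-row lambda, which tends to be slower on large inputs.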
