
I have a dataframe that looks like this:

     Text
0    this is amazing
1    nan
2    wow you are great

I want to pass the text in each cell of the dataframe to TextBlob and store the polarity in a new column. However, many rows contain nan.

I think this is causing TextBlob to assign a polarity of 0.0 to every row in the new column, even the rows that contain text.

How do I run TextBlob.sentiment.polarity over every text in my column and create a new column with the polarity scores?

New df should look like this:

     Text                 sentiment
0    this is amazing      0.9
1    nan                  0.0
2    wow you are great    0.8

I don't care about the nan rows, so their sentiment value can be nan or 0.

Current code that is not working:

for text in df.columns:
    a = TextBlob(text)
    df['sentiment']=a.sentiment.polarity
    print(df.value)

Thank you in advance.

edit:

To add, in case it makes a difference: the index on the df has not been reset, because other parts of the df are grouped together by the same index number.

RustyShackleford

3 Answers


Try this:

>>> import numpy as np
>>> import pandas as pd
>>> from textblob import TextBlob
>>> s = pd.Series(['this is amazing', np.nan, 'wow you are great'], name='Text')
>>> s
Out[100]: 
0      this is amazing
1                  NaN
2    wow you are great
Name: Text, dtype: object

>>> s.apply(lambda x: np.nan if pd.isnull(x) else TextBlob(x).sentiment.polarity)
Out[101]: 
0    0.60
1     NaN
2    0.45
Name: Text, dtype: float64
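
A note on why the loop in the question gives 0.0 everywhere, as far as I can tell: `for text in df.columns` iterates over the column labels, so TextBlob only ever scores the string 'Text', and assigning that single number to df['sentiment'] fills every row with it. The nan rows are probably not the cause. Below is a minimal sketch of the same apply idea written against a DataFrame column. `textblob_polarity` and `add_sentiment` are hypothetical helper names, not part of TextBlob or pandas.

```python
import pandas as pd


def textblob_polarity(text):
    """Score one string with TextBlob (imported lazily; requires textblob)."""
    from textblob import TextBlob
    return TextBlob(text).sentiment.polarity


def add_sentiment(df, column="Text", scorer=textblob_polarity):
    """Return a copy of df with a 'sentiment' column; missing text stays NaN."""
    out = df.copy()
    out["sentiment"] = out[column].apply(
        lambda x: float("nan") if pd.isnull(x) else scorer(x)
    )
    return out


df = pd.DataFrame({"Text": ["this is amazing", None, "wow you are great"]})

# Iterating over df.columns yields column labels, not cell values.
labels = list(df.columns)
print(labels)  # ['Text']
```

With TextBlob installed, `add_sentiment(df)` should return a frame with the Text and sentiment columns. The exact scores depend on the TextBlob version.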
hmad

Another solution:

d = {'text': ['text1', 'text2', 'text3', 'text4', 'text5'],
     'desc': ['The weather is nice today in my city.', 'I hate this weather.',
              'Nice weather today.', 'Perfect weather today.', np.nan]}
df = pd.DataFrame(data=d)
print(df)

    text                                   desc
0  text1  The weather is nice today in my city.
1  text2                   I hate this weather.
2  text3                    Nice weather today.
3  text4                 Perfect weather today.
4  text5                                    NaN

Apply sentiment analysis with TextBlob and add the result to a new column:

df['sentiment'] = df['desc'].apply(lambda x: 'NaN' if pd.isnull(x) else TextBlob(x).sentiment.polarity)
print(df)

    text                                   desc sentiment
0  text1  The weather is nice today in my city.       0.6
1  text2                   I hate this weather.      -0.8
2  text3                    Nice weather today.       0.6
3  text4                 Perfect weather today.         1
4  text5                                    NaN       NaN
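
One caveat with the string 'NaN' placeholder used above: it is text rather than a missing value, so the column ends up with object dtype and pandas does not treat that cell as missing. A small sketch of the difference follows. `score_column` is a hypothetical helper, and `fake_scorer` is a stand-in that returns a fixed number so the dtype behaviour can be seen without TextBlob.

```python
import numpy as np
import pandas as pd


def score_column(series, scorer, missing=np.nan):
    """Apply scorer to each non-null value; use `missing` for null cells."""
    return series.apply(lambda x: missing if pd.isnull(x) else scorer(x))


desc = pd.Series(["Nice weather today.", np.nan], name="desc")


def fake_scorer(text):
    # Stand-in for TextBlob(text).sentiment.polarity (assumption for the demo).
    return 0.6


as_float = score_column(desc, fake_scorer)           # real NaN placeholder
as_object = score_column(desc, fake_scorer, "NaN")   # string placeholder

print(as_float.dtype, as_float.isna().sum())
print(as_object.dtype, as_object.isna().sum())
```

If the scores are going to be averaged or filtered later, keeping a real NaN is likely the more convenient choice.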
larkee

If nan is the problem, you can apply the function only to the rows where the Text column is not nan:

mask = df['Text'].notnull()  # select the rows without nan
df.loc[mask,'sentiment'] = df.loc[mask,'Text'].apply(lambda x: TextBlob(x).sentiment.polarity)

Note: I don't have TextBlob installed, so I am assuming from your code that TextBlob(x).sentiment.polarity works.
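
Another option, sketched under the assumption that the column holds only strings and nan: Series.map accepts na_action='ignore', which skips missing values and leaves them as NaN. The result keeps the original index, so a non-reset index with repeated labels (as in the question's edit) should not matter. `textblob_polarity` and `add_sentiment_ignore_nan` are hypothetical helper names.

```python
import numpy as np
import pandas as pd


def textblob_polarity(text):
    """Score one string with TextBlob (imported lazily; requires textblob)."""
    from textblob import TextBlob
    return TextBlob(text).sentiment.polarity


def add_sentiment_ignore_nan(df, column="Text", scorer=textblob_polarity):
    """Return a copy of df with a 'sentiment' column built via Series.map."""
    out = df.copy()
    # na_action="ignore" leaves NaN cells as NaN without calling scorer on them.
    out["sentiment"] = out[column].map(scorer, na_action="ignore")
    return out


# Index labels repeat here, mimicking a frame whose index was never reset.
df = pd.DataFrame(
    {"Text": ["this is amazing", np.nan, "wow you are great"]},
    index=[7, 7, 9],
)
```

Calling `add_sentiment_ignore_nan(df)` with TextBlob installed should add the scores row by row while leaving the index labels untouched.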

Ben.T