I'm a beginner doing sentiment analysis on tweets collected with snscrape. Here is the code I used:

sentiment_df = pd.DataFrame()

for post in tqdm(df):
    polarity = getPolarityScore(post)
    sentiment = getSentiment(polarity)
    sentiment_df = sentiment_df.append(pd.Series([round(polarity, 2), sentiment, post]), ignore_index=True)


sentiment_df.columns = ['Tweet_Polarity', 'Tweet_Sentiment', 'Tweet']
sentiment_df.head(10)

Error:

 0%|                                                                                          | 0/101 [00:00<?, ?it/s]C:\Users\m\AppData\Local\Temp\ipykernel_14332\2796172053.py:6: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  sentiment_df = sentiment_df.append(pd.Series([round(polarity, 2), sentiment, post]), ignore_index=True)
  1%|▊                                                                                | 1/101 [00:00<00:00, 500.51it/s]

I have no clue how to rewrite it with pd.concat. Could someone help? Many thanks.

I tried to use this code:

sentiment_df = pd.DataFrame()

for post in tqdm(df):
    polarity = getPolarityScore(post)
    sentiment = getSentiment(polarity)
    row = pd.Series([round(polarity, 2), sentiment, post], index=['Tweet_Polarity', 'Tweet_Sentiment', 'Tweet'])
    sentiment_df = pd.concat([sentiment_df, row.to_frame().T], ignore_index=True)

sentiment_df.head(10)

but it gave an error too: 1%|▊ | 1/101 [00:00<00:00, 496.02it/s]

  • 1
    It reads to me as a `FutureWarning` and not an error per se (warning can be ignored). Do you actually get an exception and the program halts? To suppress warnings check [this related answer of mine](https://stackoverflow.com/a/44933731/3908170) – DarkCygnus Mar 12 '23 at 22:47
  • 1
    Although it is possible to loop round concatenating each row to the DF, it is preferable and more efficient to use the loop to collect the data in Lists then finally form these into column data and concat with the DF. See [link](https://stackoverflow.com/questions/71258548) and [link](https://stackoverflow.com/questions/51570405) – user19077881 Mar 12 '23 at 23:33
  • 1
    If your goal is just to create a dataframe, I don't think you should bother with `concat`. I recommend you use [`pandas.DataFrame.from_dict`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.from_dict.html) to build the dataframe. Building a dict directly in Python is easier and more intuitive. – Fanchen Bao Mar 13 '23 at 05:26

1 Answer

You can use pd.concat() like this (the two helper functions are placeholders so that the example runs on its own):

import pandas as pd
from tqdm import tqdm

def getPolarityScore(text):
    # Placeholder scorer: always returns 0.5, so every tweet below is labelled 'positive'
    return 0.5

def getSentiment(score):
    if score > 0:
        return 'positive'
    elif score < 0:
        return 'negative'
    else:
        return 'neutral'

tweets = ['This is a positive tweet!',
          'This is a negative tweet :(',
          'This tweet has neutral sentiment.',
          'Another positive tweet :)']

sentiment_df = pd.DataFrame()

for post in tqdm(tweets):
    polarity = getPolarityScore(post)
    sentiment = getSentiment(polarity)
    row = pd.Series([round(polarity, 2), sentiment, post], index=['Tweet_Polarity', 'Tweet_Sentiment', 'Tweet'])
    sentiment_df = pd.concat([sentiment_df, row.to_frame().T])

sentiment_df.reset_index(drop=True, inplace=True)
print(sentiment_df.head())

which gives

100%|██████████| 4/4 [00:00<00:00, 644.26it/s]
  Tweet_Polarity Tweet_Sentiment                              Tweet
0            0.5        positive          This is a positive tweet!
1            0.5        positive        This is a negative tweet :(
2            0.5        positive  This tweet has neutral sentiment.
3            0.5        positive          Another positive tweet :)
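
As the comments point out, calling pd.concat() inside the loop builds a new DataFrame on every iteration. Here is a minimal sketch of the alternative they suggest, assuming the same three column names: collect one dict per tweet in a plain list and construct the DataFrame once at the end. The getPolarityScore below is a hypothetical keyword-based stand-in, not the asker's real scorer, and the tweets list is sample data.

```python
import pandas as pd

# Hypothetical stand-in for the real scorer: keyword lookup only
def getPolarityScore(text):
    lowered = text.lower()
    if 'positive' in lowered:
        return 0.5
    if 'negative' in lowered:
        return -0.5
    return 0.0

def getSentiment(score):
    if score > 0:
        return 'positive'
    elif score < 0:
        return 'negative'
    return 'neutral'

# Sample data; replace with your own iterable of tweet texts
tweets = ['This is a positive tweet!',
          'This is a negative tweet :(',
          'This tweet has neutral sentiment.']

# Collect plain Python dicts first; appending to a list is cheap
rows = []
for post in tweets:
    polarity = getPolarityScore(post)
    rows.append({'Tweet_Polarity': round(polarity, 2),
                 'Tweet_Sentiment': getSentiment(polarity),
                 'Tweet': post})

# Build the DataFrame once, with an explicit column order
sentiment_df = pd.DataFrame(rows, columns=['Tweet_Polarity', 'Tweet_Sentiment', 'Tweet'])
print(sentiment_df.head(10))
```

Besides avoiding the repeated copies, this keeps Tweet_Polarity as a float column. Going through row.to_frame().T produces object-typed columns, because each row Series mixes numbers and strings.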