0

I have a Pandas dataframe containing tweets from the period July 24 2019 to 19 October 2019. I have applied the VADER sentiment analysis method to each tweet and added the sentiment scores in new columns.

Now, my hope was to visualize this in some kind of line chart in order to analyse how the averaged sentiment scores per day have changed over this three-months period. I therefore need the dates to be on the x-axis, and the averaged negative, positive and compound scores (three different lines) on the y-axis.

I have an idea that I need to somehow group or resample the data in order to show the aggregated sentiment value per day, but since my Python skills are still limited, I have not succeeded in finding a solution that works yet.

If anyone has an idea as to how I can proceed, that would be much appreciated! I have attached a picture of my dateframe as well as an example of the type of plot I had in mind :)

Cheers, Nicolai

enter image description here

enter image description here

Nicolai MC
  • 19
  • 1
  • 5

1 Answers1

0

You should have a look at the groupby() method: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.groupby.html

Simply create a day column which contains a timestamp/datetime_object/dict/tuple/str ... which represents the day of the tweet and not it's exact time . Then use the groupby() method on this column. If you don't know how to create this column, an easy way of doing it is using https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html

Keep in mind that groupby method doesn't return a DataFrame but a groupby_generic.DataFrameGroupBy so you'll have to choose a way of aggreating the data in your groups (you should probably do groupby().mean() in your case, see grouby method documentation for more information)

TBER
  • 1
  • 1