I am trying to plot a daily follower count for various Twitter handles. The result would be something like what you see below, but filterable by more than one Twitter handle:
Usually, I would do this by simply appending each new dataset pulled from Twitter to the original table, along with the date the log was pulled. However, this would leave me with a million rows within just a few days, and it wouldn't let me clearly see when a user has dropped off.
As an alternative, after pulling my data from Twitter, I structured my pandas dataframe like this:
Follower_ID  Handles  Start_Date  End_Date
100          x        30/05/2017  NaN
101          x        21/04/2017  29/05/2017
201          y        14/06/2017  NaN
100          y        16/06/2017  28/06/2017
Where:
Handles: the accounts I am pulling the followers for
Follower_ID: the user following a handle
So, for example, if I were Follower_ID 100, I could follow both handle x and handle y.
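To make the structure reproducible, here is a minimal sketch that rebuilds the sample rows above. It assumes the dates are day-first strings (dd/mm/yyyy) and that a missing End_Date means the user is still following; neither is confirmed beyond what the table shows.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the table above.
df = pd.DataFrame({
    "Follower_ID": [100, 101, 201, 100],
    "Handles": ["x", "x", "y", "y"],
    "Start_Date": ["30/05/2017", "21/04/2017", "14/06/2017", "16/06/2017"],
    "End_Date": [np.nan, "29/05/2017", np.nan, "28/06/2017"],
})

# Assumption: dates are day-first (dd/mm/yyyy). A missing End_Date
# becomes NaT, read here as "still following".
for col in ["Start_Date", "End_Date"]:
    df[col] = pd.to_datetime(df[col], format="%d/%m/%Y")

print(df.dtypes)
```

Parsing with an explicit format avoids 30/05 and 05/30 being guessed differently from row to row.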
I am wondering what would be the best way to prepare the data (pivot, clean through a function, groupby) so that it can then be plotted accordingly. Any ideas?
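For reference, one possible preparation (a sketch only, not necessarily the best approach) is to turn each start/end interval into a +1 event and a -1 event, then take a running total per handle. The cutoff `last_day` is a hypothetical value chosen for illustration, and the sketch assumes End_Date is inclusive, i.e. the user still counts as a follower on that day.

```python
import numpy as np
import pandas as pd

# Sample rows from the question, with day-first dates parsed to datetimes.
df = pd.DataFrame({
    "Follower_ID": [100, 101, 201, 100],
    "Handles": ["x", "x", "y", "y"],
    "Start_Date": ["30/05/2017", "21/04/2017", "14/06/2017", "16/06/2017"],
    "End_Date": [np.nan, "29/05/2017", np.nan, "28/06/2017"],
})
for col in ["Start_Date", "End_Date"]:
    df[col] = pd.to_datetime(df[col], format="%d/%m/%Y")

# Hypothetical last day to plot; open intervals (NaT) are assumed to run up to it.
last_day = pd.Timestamp("2017-06-30")
end = df["End_Date"].fillna(last_day)

# +1 on the day a follow starts, -1 on the day after it ends
# (assumption: End_Date itself still counts as a followed day).
starts = pd.DataFrame({"Handles": df["Handles"], "Date": df["Start_Date"], "delta": 1})
stops = pd.DataFrame({"Handles": df["Handles"], "Date": end + pd.Timedelta(days=1), "delta": -1})
events = pd.concat([starts, stops], ignore_index=True)

# Net change per day and handle, spread over a continuous daily index,
# then accumulated into a follower count (one column per handle).
days = pd.date_range(df["Start_Date"].min(), last_day, freq="D")
daily = (
    events.pivot_table(index="Date", columns="Handles", values="delta",
                       aggfunc="sum", fill_value=0)
    .reindex(days, fill_value=0)
    .cumsum()
)

print(daily.tail())
```

The result has one row per day and one column per handle, so filtering by several handles is a column selection such as `daily[["x", "y"]]`, which can then be passed to `.plot()` if matplotlib is installed. Because only the changes are stored, the table grows with the number of follow and unfollow events rather than with followers times days.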