I have the following row dataset that is containing disparate dates.
Id Date Quantity
0 122222ST 2020-10-19 -1.0
1 122222ST 2020-10-22 75.0
2 122222ST 2020-10-22 1100.0
3 100128 2020-10-20 5.0
that I want to turn into a time serie for each Id such that every dates are here, if an Id has multiple for a date, they add up, and if there is none, it shows a 0. a minimal reproducible example would like this:
Date 122222ST 100128
2020-10-19 -1.0 0
2020-10-20 0 5
2020-10-21 0 0
2020-10-22 1175 0
I have been able to do this so far:
#read data
df = pd.read_csv('historical_data2.csv', sep=";")
#remove unwanted column
df = df.drop('NetAmount',axis=1)
#set date to datetime
df['Date'] = pd.to_datetime(df['Date'])
#get unique Ids and sum quantities per date
df = df.groupby(['Id', 'Date']).agg({'Quantity':'sum'}).reset_index()
#adding all missing dates
df = df.groupby(pd.Grouper(key='Date', freq='D'))
From this point on I am stuck because I cannot find a way to set Id values as the horizontal index. I have seen this example Python Pandas: How to set Dataframe Column value as X-axis labels but it regards plotting which I am interested in. Any clue?