I have a dataset with multiple IDs (SKUs) and dates, for which I have created a cumulative supply column in Python (pandas).
My data is as follows:
SKU Date Demand Supply Cum_Supply
1 20160207 6 2 2
1 20160214 5 0 2
1 20160221 1 0 2
1 20160228 6 0 2
1 20160306 1 0 2
1 20160313 101 0 2
1 20160320 1 0 2
1 20160327 1 0 2
2 20160207 0 0 0
2 20160214 0 0 0
2 20160221 2 0 0
2 20160228 2 0 0
2 20160306 2 0 0
2 20160313 1 0 0
2 20160320 1 0 0
2 20160327 1 0 0
where Cum_Supply was calculated with:
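# Build the full Date x SKU grid; combinations missing from the data are filled with 0
idx = pd.MultiIndex.from_product([np.unique(data.Date), data.SKU.unique()], names=['Date', 'SKU'])
data2 = data.set_index(['Date', 'SKU']).reindex(idx).fillna(0)
# Per-SKU running totals, appended as Cum_Demand and Cum_Supply
data2 = pd.concat([data2, data2.groupby(level=1).cumsum().add_prefix('Cum_')], axis=1).sort_index(level=1).reset_index()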
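I want to create a column 'True_Demand', which is the maximum unfulfilled demand up to that date plus the cumulative supply, i.e. max(Demand - Supply) + Cum_Supply, where the max is taken per SKU over all dates up to and including the current one.
So my output would look something like this: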
SKU Date Demand Supply Cum_Supply True_Demand
1 20160207 6 2 2 6
1 20160214 5 0 2 7
1 20160221 1 0 2 7
1 20160228 6 0 2 8
1 20160306 1 0 2 8
1 20160313 101 0 2 103
1 20160320 1 0 2 103
1 20160327 1 0 2 103
2 20160207 0 0 0 0
2 20160214 0 0 0 0
2 20160221 2 0 0 2
2 20160228 2 0 0 2
2 20160306 2 0 0 2
2 20160313 1 0 0 2
2 20160320 1 0 0 2
2 20160327 1 0 0 2
So for the 3rd record (20160221), the maximum unfulfilled demand up to that date is 5 (from 20160214). The True_Demand is therefore 5 + 2 = 7, even though the unfulfilled demand on that date alone would give only 1 + 2 = 3.
Code for the dataframe:
import numpy as np
import pandas as pd

data = pd.DataFrame({'SKU': [1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2],
                     'Date': [20160207,20160214,20160221,20160228,20160306,20160313,20160320,20160327,20160207,20160214,20160221,20160228,20160306,20160313,20160320,20160327],
                     'Demand': [6,5,1,6,1,101,1,1,0,0,2,2,2,1,1,1],
                     'Supply': [2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]},
                    columns=['Date', 'SKU', 'Demand', 'Supply'])
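To make the definition of True_Demand concrete, here is a minimal sketch of the calculation as I understand it. It assumes that "max unfulfilled demand till that date" means a per-SKU running maximum of Demand - Supply that includes the current row. It also assumes every SKU/Date combination is already present, so the reindexing step is skipped. The name `unfulfilled` is only a helper introduced for illustration.

```python
import pandas as pd

dates = [20160207, 20160214, 20160221, 20160228,
         20160306, 20160313, 20160320, 20160327]

data = pd.DataFrame({
    'SKU': [1] * 8 + [2] * 8,
    'Date': dates * 2,
    'Demand': [6, 5, 1, 6, 1, 101, 1, 1, 0, 0, 2, 2, 2, 1, 1, 1],
    'Supply': [2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
}, columns=['Date', 'SKU', 'Demand', 'Supply'])

# Running calculations depend on row order, so sort by SKU and then Date
data = data.sort_values(['SKU', 'Date']).reset_index(drop=True)

# Cumulative supply within each SKU
data['Cum_Supply'] = data.groupby('SKU')['Supply'].cumsum()

# Unfulfilled demand per row (hypothetical helper), then its running max per SKU
unfulfilled = data['Demand'] - data['Supply']
data['True_Demand'] = unfulfilled.groupby(data['SKU']).cummax() + data['Cum_Supply']

print(data)
```

On the sample data this gives the True_Demand column shown in the expected output above. Whether a running maximum is the right reading for data with gaps or later supply deliveries is something I have not checked.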