How can I get the same result in Dask that I'm getting in pandas?
The objective is to have a uniform time interval for each group, replicating the last value until we have a new one.
import pandas as pd
import numpy as np
import datetime
data=pd.DataFrame([["AAAA","2020-01-15",2],
["AAAA","2020-02-15",9],
["AAAA","2020-02-20",2],
["AAAA","2020-02-25",9],
["AAAA","2020-04-18",2],
["BBBB","2020-01-01",5],
["BBBB","2020-02-15",5],
["BBBB","2020-02-20",4],
["BBBB","2020-02-25",4],
["BBBB","2020-04-15",2],
["CCCC","2020-01-01",9],
["CCCC","2020-02-15",5],
["CCCC","2020-03-20",7],
["CCCC","2020-04-25",4],
["CCCC","2020-05-15",2]])
data.columns=['Asset','Date','P']
data['Date']=pd.to_datetime(data['Date'])
data.index=data['Date'].values
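# Resample each asset to a 2-day grid, carrying the last known value forward
# (ffill() is the current name for pad(), which newer pandas versions no longer provide)
temp=data.groupby('Asset').resample('2D').ffill()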
temp
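To make the objective concrete, here is a minimal pure-Python sketch of the fill I want for one group. It assumes the grid starts at the group's first date, steps by 2 days, and that each grid point takes the most recent observation at or before it. The helper name `forward_fill_on_grid` is hypothetical (not a pandas or Dask function), and I have not checked that it matches pandas exactly at the end of every group.

```python
from datetime import date, timedelta


def forward_fill_on_grid(observations, step_days=2):
    """Return (day, value) pairs on a regular grid, repeating the last seen value.

    `observations` is a list of (date, value) pairs for ONE group, sorted by date.
    Hypothetical helper for illustration only.
    """
    if not observations:
        return []
    step = timedelta(days=step_days)
    current = observations[0][0]      # grid starts at the first observation
    last_day = observations[-1][0]    # grid stops at or before the last observation
    filled = []
    i = 0  # index of the most recent observation at or before `current`
    while current <= last_day:
        # Advance to the latest observation that is not after this grid point
        while i + 1 < len(observations) and observations[i + 1][0] <= current:
            i += 1
        filled.append((current, observations[i][1]))
        current += step
    return filled


# The "AAAA" rows from the example data
aaaa = [
    (date(2020, 1, 15), 2),
    (date(2020, 2, 15), 9),
    (date(2020, 2, 20), 2),
    (date(2020, 2, 25), 9),
    (date(2020, 4, 18), 2),
]
grid = forward_fill_on_grid(aaaa)
```

What I am looking for is a way to run this kind of per-group fill on a Dask DataFrame.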
Note: this is just a small example; the real-world dataset is very large.
Thanks!