I have a dataset like below:

df = pd.DataFrame({'Date':['2019-01-01','2019-01-03','2019-01-01','2019-01-04','2019-01-01','2019-01-03'],'Name':['A','A','B','B','C','C'],'Open Price':[100,200,300,400,500,600],'Close Price':[200,300,400,500,600,700]})

As you can see, a few daily entries are missing from this table: 2019-01-02 for A, 2019-01-02 and 2019-01-03 for B, and 2019-01-02 for C.

What I'm looking to do is add dummy rows to the dataframe for these missing dates.

In each dummy row, the close price should equal the open price of the next available entry for that name. I don't care about the open price in the dummy rows; it can be either NaN or 0.

Expected output

pd.DataFrame({'Date':['2019-01-01','2019-01-02','2019-01-03','2019-01-01','2019-01-02','2019-01-03','2019-01-04','2019-01-01','2019-01-02','2019-01-03'],'Name':['A','A','A','B','B','B','B','C','C','C'],'Open Price':[50,'nan',150,250,'nan','nan',350,450,'nan',550],'Close Price':[200,150,300,400,350,350,500,600,550,700]})

Any help would be appreciated!

1 Answer

Your logic for how the prices should be interpolated is fuzzy, but to get you started, consider this (remember to convert Date to a datetime dtype first):

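import pandas as pd

# Resampling requires a datetime dtype.
df['Date'] = pd.to_datetime(df['Date'])
# Per Name, resample to daily rows (missing days become NaN),
# then fill the gaps by linear interpolation.
df = (df.groupby('Name')
        .resample('D', on='Date')
        .mean()
        .swaplevel()
        .interpolate()
)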

print(df)
                 Open Price  Close Price
Date       Name                         
2019-01-01 A     100.000000   200.000000
2019-01-02 A     150.000000   250.000000
2019-01-03 A     200.000000   300.000000
2019-01-01 B     300.000000   400.000000
2019-01-02 B     333.333333   433.333333
2019-01-03 B     366.666667   466.666667
2019-01-04 B     400.000000   500.000000
2019-01-01 C     500.000000   600.000000
2019-01-02 C     550.000000   650.000000
2019-01-03 C     600.000000   700.000000
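
The interpolated output above differs from what the question asks for: NaN open prices on the dummy rows, with the close price taken from the next available open price for the same name. A minimal sketch of that variant follows, assuming the sample input from the question. The helper name fill_missing_days is hypothetical. Because the expected output in the question uses open prices that do not match the sample input, the numbers here follow the input.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2019-01-01', '2019-01-03', '2019-01-01',
             '2019-01-04', '2019-01-01', '2019-01-03'],
    'Name': ['A', 'A', 'B', 'B', 'C', 'C'],
    'Open Price': [100, 200, 300, 400, 500, 600],
    'Close Price': [200, 300, 400, 500, 600, 700],
})
df['Date'] = pd.to_datetime(df['Date'])


def fill_missing_days(name, group):
    """Hypothetical helper: add dummy rows for one Name's missing days."""
    group = group.set_index('Date').sort_index()
    # Complete daily range between this Name's first and last dates.
    full_range = pd.date_range(group.index.min(), group.index.max(), freq='D')
    filled = group.reindex(full_range)
    filled['Name'] = name
    # Dummy rows keep a NaN Open Price; their Close Price is set to the
    # next available Open Price (backfill), as described in the question.
    next_open = filled['Open Price'].bfill()
    filled['Close Price'] = filled['Close Price'].fillna(next_open)
    return filled.rename_axis('Date').reset_index()


# Process each Name separately, then stack the pieces back together.
pieces = [fill_missing_days(name, group) for name, group in df.groupby('Name')]
result = pd.concat(pieces, ignore_index=True)
print(result)
```

With the sample input, the dummy rows for A, B and C get close prices of 200, 400 and 600 respectively, i.e. the next recorded open price within each name. If 0 is preferred over NaN for the dummy open prices, calling fillna(0) on that column afterwards would do it.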
manwithfewneeds