1

I'm running out of ideas with this one: I'm using a dataset that is read in using

import pandas as pd
data = pd.read_csv('data.csv', index_col=[0], names=['Date', 'GDAXI', 'GSPC'], header=0)
data

Output:

             GDAXI              GSPC
Date        
2019-07-23  12490.740234    3005.469971
2019-07-24  12522.889648    3019.560059
2019-07-25  12362.099609    3003.669922
2019-07-26  12419.900391    3025.860107
2019-07-27  12419.900391    3025.860107
... ... ...
2020-07-17  12919.610352    3224.729980
2020-07-20  13046.919922    3251.840088
2020-07-21  13171.830078    3257.300049
2020-07-22  13104.250000    3276.020020
2020-07-23  13103.389648    3256.409912
261 rows × 2 columns

There are missing dates (weekends), that I want to fill with 0 using

data = data.reindex(dates, fill_value=0)

This gives the following output:

          GDAXI GSPC
2019-07-23  0.0 0.0
2019-07-24  0.0 0.0
2019-07-25  0.0 0.0
2019-07-26  0.0 0.0
2019-07-27  0.0 0.0
... ... ...
2020-07-19  0.0 0.0
2020-07-20  0.0 0.0
2020-07-21  0.0 0.0
2020-07-22  0.0 0.0
2020-07-23  0.0 0.0
367 rows × 2 columns

So for some reason reindex() is interpreting everything as missing data.

Has anyone got any ideas what's going on? Cheers!

Paul
  • 23
  • 2

2 Answers2

1

Try this -

  1. Convert Date column to Datetime using pd.to_Datetime function
  2. Now, set Date as index using data=data.set_index('Date')
  3. Generate the list 'dates' using timedelta using the start and end Date as a reference
  4. Now this should work - data = data.reindex(dates, fill_value=0)
MHK
  • 156
  • 1
  • 5
1

Expanding on @mahir kukreja answer, this works for me:

df.index = pd.to_datetime(df.index)
dates = pd.date_range(df.index.min(), df.index.max())
df = df.reindex(dates, fill_value=0)
kait
  • 1,327
  • 9
  • 13