0

Let's say I have the following data.

dates=['2020-12-01','2020-12-04','2020-12-05', '2020-12-01','2020-12-04','2020-12-05']
symbols=['ABC','ABC','ABC','DEF','DEF','DEF']
v=[1,3,5,7,9,10]
df= pd.DataFrame({'date':dates, 'g':symbols, 'v':v})
         date    g   v
0  2020-12-01  ABC   1
1  2020-12-04  ABC   3
2  2020-12-05  ABC   5
3  2020-12-01  DEF   7
4  2020-12-04  DEF   9
5  2020-12-05  DEF  10

I'd like to fill the missing dates with previous value (group by field 'g') For example, I want the following entrees added in the above example:

2020-12-02 ABC 1
2020-12-03 ABC 1
2020-12-02 DEF 7
2020-12-03 DEF 7

how can I do this?

Akshay Sehgal
  • 18,741
  • 3
  • 21
  • 51

1 Answers1

0

The answer is borrowed mostly from the following answer, with the exception of filling with a negative value and using that to replace with nulls for the forward fill.

Original Answer Here

dates=['2020-12-01','2020-12-04','2020-12-05', '2020-12-01','2020-12-04','2020-12-05']
symbols=['ABC','ABC','ABC','DEF','DEF','DEF']
v=[1,3,5,7,9,10]
df= pd.DataFrame({'date':dates, 'g':symbols, 'v':v})

df['date'] = pd.to_datetime(df['date'])

df = df.set_index(
    ['date', 'g']
).unstack(
    fill_value=-999
).asfreq(
    'D', fill_value=-999
).stack().sort_index(level=1).reset_index()

df.replace(-999, np.nan).ffill()
Chris
  • 15,819
  • 3
  • 24
  • 37