My data looks like this:

id, date, target
1,2016-10-24,22
1,2016-10-25,31
1,2016-10-27,44
1,2016-10-28,12
2,2016-10-21,22
2,2016-10-22,31
2,2016-10-25,44
2,2016-10-27,12

I want to fill in the missing dates within each id. For example, the date range for id=1 is 2016-10-24 to 2016-10-28, and 2016-10-26 is missing. Likewise, the date range for id=2 is 2016-10-21 to 2016-10-27, and 2016-10-23, 2016-10-24 and 2016-10-26 are missing. For each missing date I want to add a row with the target value set to 0.

So I want my data to look like this:

id, date, target
1,2016-10-24,22
1,2016-10-25,31
1,2016-10-26,0
1,2016-10-27,44
1,2016-10-28,12
2,2016-10-21,22
2,2016-10-22,31
2,2016-10-23,0
2,2016-10-24,0
2,2016-10-25,44
2,2016-10-26,0
2,2016-10-27,12

Can somebody help me?

Thanks in advance.

Mansumen

1 Answer

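You can use groupby with resample. The catch is fillna: it does not work on the resampled groups directly, so call asfreq first to turn the missing dates into NaN rows, then fill those with 0: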

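import pandas as pd

# Convert to datetime if the column is not already datetime64
df['date'] = pd.to_datetime(df['date'])
# resample needs a DatetimeIndex
df = df.set_index('date')
# Per id: build a daily grid (gaps become NaN), then fill the gaps with 0
df = df.groupby('id').resample('D')['target'].asfreq().fillna(0).astype(int).reset_index()
print(df)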
    id       date  target
0    1 2016-10-24      22
1    1 2016-10-25      31
2    1 2016-10-26       0
3    1 2016-10-27      44
4    1 2016-10-28      12
5    2 2016-10-21      22
6    2 2016-10-22      31
7    2 2016-10-23       0
8    2 2016-10-24       0
9    2 2016-10-25      44
10   2 2016-10-26       0
11   2 2016-10-27      12
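
For comparison, here is a self-contained sketch of an alternative that avoids resample: each id is reindexed onto its own daily calendar with pd.date_range and reindex. The sample frame is copied from the question; the names pieces, full_range and filled are made up for this illustration, and this is a different technique from the one-liner above, not a replacement for it.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2, 2, 2],
    "date": pd.to_datetime([
        "2016-10-24", "2016-10-25", "2016-10-27", "2016-10-28",
        "2016-10-21", "2016-10-22", "2016-10-25", "2016-10-27",
    ]),
    "target": [22, 31, 44, 12, 22, 31, 44, 12],
})

pieces = []  # hypothetical name: one filled frame per id
for key, group in df.groupby("id"):
    # Daily calendar from this id's first date to its last date.
    full_range = pd.date_range(group["date"].min(), group["date"].max(), freq="D")
    # Reindex onto the calendar; dates that were absent get target = 0.
    piece = group.set_index("date")[["target"]].reindex(full_range, fill_value=0)
    piece = piece.rename_axis("date").reset_index()
    # Put the id back as the first column.
    piece.insert(0, "id", key)
    pieces.append(piece)

filled = pd.concat(pieces, ignore_index=True)
print(filled)
```

Because reindex is given fill_value=0, target stays an integer column and no astype(int) is needed afterwards. The loop is more verbose than groupby plus resample, but it makes the per-id date range explicit, which can help if each id later needs a different range.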
jezrael