Fill gaps in DataFrame MultiIndex level 1, differently for each level 0

Question

I have a MultiIndex DataFrame with gappy date values on level 1, like this:

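import random
import numpy as np
import pandas as pd
np.random.seed(456)  # seeds NumPy only; the dates come from Python's random module, so they vary per run
j = [(a, b) for a in ['A','B','C'] for b in random.sample(pd.date_range('2018-01-01', periods=100, freq='D').tolist(), 5)]
j.sort()
i = pd.MultiIndex.from_tuples(j, names=['Name','Date'])
# np.random.random_integers is deprecated; randint's upper bound is exclusive, hence 101
df = pd.DataFrame(np.random.randint(0, 101, 15), i, columns=['Vals'])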
# print(df):
                 Vals
Name Date            
A    2018-01-01    27
     2018-01-08    43
     2018-03-26    89
     2018-03-29    42
     2018-04-01    28
B    2018-01-02    79
     2018-01-26    60
     2018-02-18    45
     2018-03-11    37
     2018-03-23    92
C    2018-03-17    39
     2018-03-20    81
     2018-03-21    11
     2018-03-27    77
     2018-04-08    69

For each level 0 value, I want to fill in the index level 1 with every calendar date between the min and max date values for that level 0. (This Q&A addresses the scenario of filling in level 1 with the same value set for all level 0 values.)

E.g., for subset = df.loc['A'] I want to insert rows so that subset.index.values == pd.date_range(subset.index.values.min(), subset.index.values.max()).values. I.e., the resulting DataFrame would look like:

                 Vals
Name Date            
A    2018-01-01    27
     2018-01-02   NaN
     2018-01-03   NaN
     2018-01-04   NaN
     2018-01-05   NaN
     2018-01-06   NaN
     2018-01-07   NaN
     2018-01-08    43
     2018-01-09   NaN
...

Is there a pandaic way to accomplish this?

(The best I can come up with is to inefficiently and iteratively append new DataFrames for each level 0 value. Or similarly iteratively construct a list of index values and then pandas.concat them with the original DataFrame.)
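For comparison, the iterative approach described above might look roughly like the following sketch. It is an illustration under assumptions, not the asker's actual code: the four-row frame is made up for brevity, and it builds one combined index and calls `reindex` rather than appending DataFrames or using `pandas.concat`.

```python
import pandas as pd

# Hypothetical miniature of the question's data: two names, gappy dates.
idx = pd.MultiIndex.from_tuples(
    [
        ("A", pd.Timestamp("2018-01-01")),
        ("A", pd.Timestamp("2018-01-04")),
        ("B", pd.Timestamp("2018-02-10")),
        ("B", pd.Timestamp("2018-02-12")),
    ],
    names=["Name", "Date"],
)
df = pd.DataFrame({"Vals": [27, 43, 79, 60]}, index=idx)

# For each Name, build every (Name, Date) pair between that Name's
# own first and last date.
pieces = []
for name, group in df.groupby(level="Name"):
    dates = group.index.get_level_values("Date")
    days = pd.date_range(dates.min(), dates.max(), freq="D")
    pieces.append(pd.MultiIndex.from_product([[name], days], names=["Name", "Date"]))

# Join the per-name indexes and reindex; rows that did not exist get NaN.
full_index = pieces[0].append(pieces[1:])
filled = df.reindex(full_index)
```

This works, but it needs an explicit loop and an intermediate list of indexes, which is the overhead the question hopes to avoid.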

score 5 · Accepted Answer · answered Mar 06 '18 at 21:12

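You can use asfreq within each group: drop the Name level so the group has a plain DatetimeIndex, call asfreq("D") to insert every missing calendar day between that group's first and last date (the new rows get NaN), and let groupby put Name back as the outer level: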

df.groupby(level=0).apply(lambda x: x.reset_index(level=0, drop=True).asfreq("D"))
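To see the effect on a small case, here is a minimal self-contained sketch. The four-row frame is a made-up stand-in for the question's data; the `groupby` / `asfreq` line is the one from this answer.

```python
import pandas as pd

# Hypothetical miniature of the question's data: two names, gappy dates.
idx = pd.MultiIndex.from_tuples(
    [
        ("A", pd.Timestamp("2018-01-01")),
        ("A", pd.Timestamp("2018-01-04")),
        ("B", pd.Timestamp("2018-02-10")),
        ("B", pd.Timestamp("2018-02-12")),
    ],
    names=["Name", "Date"],
)
df = pd.DataFrame({"Vals": [27, 43, 79, 60]}, index=idx)

# Per group: drop the Name level, then fill in daily dates between the
# group's own min and max. groupby re-attaches Name as the outer level.
filled = df.groupby(level=0).apply(
    lambda x: x.reset_index(level=0, drop=True).asfreq("D")
)

print(filled)
```

Group A is expanded to the four days from 2018-01-01 to 2018-01-04 and group B to the three days from 2018-02-10 to 2018-02-12, so each name gets its own date range. Because the inserted rows hold NaN, the Vals column becomes floating point.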

BENY

This is clever! – MaxU - stand with Ukraine Mar 06 '18 at 21:12
Damn ... brilliant! This is why I worry anytime I'm using more than one line of python to do something ;) – feetwet Mar 06 '18 at 21:14
@feetwet sometimes multiple lines are good for readability and speed :-) – BENY Mar 06 '18 at 21:16
