I have a DataFrame with a MultiIndex. The first level is a DatetimeIndex with weekly frequency. The second level is NOT consistent across groupings by the first level.
I want to group the first level by month and take the first week's rows.
Setup
import numpy as np
import pandas as pd

midx = pd.MultiIndex.from_arrays([
    pd.date_range('2018-01-01', freq='W', periods=10).repeat(2),
    list('ABCDEFGHIJ' * 2)
], names=['Date', 'Thing'])
df = pd.DataFrame(dict(Col=np.arange(10, 30)), midx)
Expected Results
df
                  Col
Date       Thing
2018-01-07 A       10  # This is the first week
           B       11  # of January 2018
2018-01-14 C       12
           D       13
2018-01-21 E       14
           F       15
2018-01-28 G       16
           H       17
2018-02-04 I       18  # This is the first week
           J       19  # of February 2018
2018-02-11 A       20
           B       21
2018-02-18 C       22
           D       23
2018-02-25 E       24
           F       25
2018-03-04 G       26  # This is the first week
           H       27  # of March 2018
2018-03-11 I       28
           J       29
Results should be
                  Col
Date       Thing
2018-01-07 A       10  # This is the first week
           B       11  # of January 2018
2018-02-04 I       18  # This is the first week
           J       19  # of February 2018
2018-03-04 G       26  # This is the first week
           H       27  # of March 2018
Attempt
df.unstack().asfreq('M', 'ffill').stack()
                   Col
Date       Thing
2018-01-31 G      16.0
           H      17.0
2018-02-28 E      24.0
           F      25.0
This is wrong on several levels.
- Date is the actual month end and not the actual date observed.
- Things are not from the correct date. Notice that I wanted ['A', 'B'] from '2018-01-07' and not ['G', 'H'].
- I'm unstacking to enable myself to use asfreq, but that introduces nan and converts to float.
- I don't know what happened to March 2018.
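
For reference, the selection I'm after can be sketched without unstack or asfreq. This is only a minimal sketch, not necessarily the best approach: it assumes that "first week of the month" means the earliest date actually observed in each calendar month, and the names dates, first_dates and result are mine, introduced for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the Setup section.
midx = pd.MultiIndex.from_arrays([
    pd.date_range('2018-01-01', freq='W', periods=10).repeat(2),
    list('ABCDEFGHIJ' * 2),
], names=['Date', 'Thing'])
df = pd.DataFrame(dict(Col=np.arange(10, 30)), midx)

# Dates from the first index level, one entry per row.
dates = df.index.get_level_values('Date')

# Earliest observed date within each calendar month
# (assumption: this is what "first week" should mean).
first_dates = dates.to_series().groupby(dates.to_period('M')).min()

# Keep only the rows whose date is one of those first dates.
# No reshaping happens, so no NaN is introduced and Col stays integer.
result = df[dates.isin(first_dates)]
print(result)
```

Because the rows are filtered with a boolean mask rather than resampled, the original dates and the original Thing labels are kept as they are.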