I have a pandas dataframe with MultiIndex columns, with 3 levels:
import itertools
import numpy as np
def mklbl(prefix, n):
return ["%s%s" % (prefix, i) for i in range(n)]
miindex = pd.MultiIndex.from_product([mklbl('A', 4)])
micolumns = pd.MultiIndex.from_tuples(list(itertools.product(['A', 'B'], ['a', 'b', 'c'], ['foo', 'bar'])),
names=['lvl0', 'lvl1', 'lvl2'])
dfmi = pd.DataFrame(np.arange(len(miindex) * len(micolumns)).reshape((len(miindex), len(micolumns))),
index=miindex,
columns=micolumns).sort_index().sort_index(axis=1)
lvl0 A B
lvl1 a b c a b c
lvl2 bar foo bar foo bar foo bar foo bar foo bar foo
A0 1 0 3 2 5 4 7 6 9 8 11 10
A1 13 12 15 14 17 16 19 18 21 20 23 22
A2 25 24 27 26 29 28 31 30 33 32 35 34
A3 37 36 39 38 41 40 43 42 45 44 47 46
I want to mask this dataframe, based on another dataframe, which has the last two levels of the index:
cols = micolumns.droplevel(0).unique()
a_mask = pd.DataFrame(np.random.randn(len(dfmi.index), len(cols)), index=dfmi.index, columns=cols)
a_mask = (np.sign(a_mask) > 0).astype(bool)
a b c
foo bar foo bar foo bar
A0 False False False True True False
A1 True False True False True True
A2 True True True True False False
A3 True False False True True False
What I would like to do is to mask the original dataframe according to a_mask
.
Let's say I want to set the original entries to zero, when a_mask
is true.
I tried to use pd.IndexSlice
, but it fails silently (i.e. I can run the following code, but has no effect:
dfmi.loc[:, pd.IndexSlice[:, a_mask]] = 0 #dfmi is unchanged
Any suggestion how to achieve this?
Edit In my use case, the labels are constructed with a cartesian product, so there will be all combinations of (lev0, lev1, lev2). But it is the case that lev0 can assume 2 values {A, B}, while lev1 can assume 3 values {a, b, c}