I have a DataFrame with three levels on the main index:
import numpy as np
from pandas import *
df_multi = DataFrame(np.random.rand(6,2), index = [['CF', 'CF', 'CF', 'DA', 'DA','DA'], ['x', 'y', 'y', 'x', 'y', 'y'], ['a', 'b', 'a', 'a', 'a', 'b']], columns = ['PC1', 'PC2'])
df_multi.index.names =['l1','l2','l3']
In [5]: df_multi
Out[5]:
               PC1       PC2
l1 l2 l3
CF x  a   0.118061  0.473159
   y  b   0.159534  0.407676
      a   0.466731  0.163322
DA x  a   0.152799  0.333438
   y  a   0.632725  0.965348
      b   0.737112  0.834592
Now I want to sum over the third level and then divide each element by its corresponding sum, to get each row's share within its (l1, l2) group (e.g. divide (CF, x, a) by (CF, x, a), and (CF, y, a) by (CF, y, a) + (CF, y, b), etc.). First I compute the sums:
In [6]: df_multi.sum(level = [0, 1])
Out[6]:
            PC1       PC2
l1 l2
CF x   0.118061  0.473159
   y   0.626265  0.570998
DA x   0.152799  0.333438
   y   1.369837  1.799940
and then I try to divide by these sums:
df_multi_share = df_multi.div(df_multi.sum(level = [0, 1]), level=[0, 1])
However, this does not work. I am looking for a general solution, not confined to computing percentage shares, that lets me do arithmetic while matching on multiple index levels. It does work when matching on only one level, e.g.
df_multi = DataFrame(np.random.rand(4,2), index = [['CF', 'CF', 'DA', 'DA'], ['1', '2', '1', '2']], columns = ['PC1', 'PC2'])
df_single = DataFrame(np.random.rand(3,3), index = ['1', '2', '3'], columns = ['PC1', 'PC2', 'PC3'])
df_combined = df_multi.mul(df_single, level = 1)
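To make the target concrete, here is a minimal sketch of the result I am after, assuming a recent pandas version. It uses `groupby(level=...).transform("sum")` for the shares, and a small helper, `align_to` (a hypothetical name of my own, not a pandas function), that broadcasts a frame indexed by fewer levels onto the full index so that any arithmetic can follow. This is one possible approach under those assumptions, not necessarily the intended way to do it.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_arrays(
    [
        ["CF", "CF", "CF", "DA", "DA", "DA"],
        ["x", "y", "y", "x", "y", "y"],
        ["a", "b", "a", "a", "a", "b"],
    ],
    names=["l1", "l2", "l3"],
)
df_multi = pd.DataFrame(np.random.rand(6, 2), index=index, columns=["PC1", "PC2"])

# Shares: transform("sum") returns the group sums on the original
# three-level index, so plain division lines up row by row.
group_sums = df_multi.groupby(level=["l1", "l2"]).transform("sum")
df_multi_share = df_multi / group_sums


def align_to(lower, target_index, levels):
    """Hypothetical helper: repeat the rows of `lower` (indexed by `levels`)
    so that they line up with every row of `target_index`."""
    extra = [name for name in target_index.names if name not in levels]
    keys = target_index.droplevel(extra)   # one (l1, l2) key per target row
    aligned = lower.reindex(keys)          # look up the matching row for each key
    aligned.index = target_index           # restore the full three-level index
    return aligned


# General case: any operation (div, mul, sub, ...) against a two-level frame.
sums = df_multi.groupby(level=["l1", "l2"]).sum()
df_multi_share_general = df_multi.div(align_to(sums, df_multi.index, ["l1", "l2"]))
```

Both variants should give the same frame, and the shares within each (l1, l2) group should add up to 1.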