Adding a MultiIndex Summary Column at the Beginning of a DataFrame

Question

If I have a DataFrame and I'd like to insert a summary column at the beginning, I can run

df.insert(0, 'Average', df.mean(axis='columns'))

Say I have a MultiIndexed dataframe of the form

import numpy as np
import pandas as pd

df = pd.DataFrame()
for l1 in ('a', 'b'):
    for l2 in ('one', 'two'):
        df[l1, l2] = np.random.random(size=5)
df.columns = pd.MultiIndex.from_tuples(df.columns, names=['L1', 'L2'])

L1         a                   b          
L2       one       two       one       two
0   0.585409  0.563870  0.535770  0.868020
1   0.404546  0.102884  0.254945  0.362751
2   0.475362  0.601632  0.476761  0.665126 
3   0.926288  0.615655  0.257977  0.668778
4   0.509069  0.706685  0.355842  0.891862

How do I add the mean of all the one columns and all the two columns to the first two columns of this DataFrame and call it 'Average'?

EDIT: Expected output would be df.mean(level=1, axis=1) but inserted into the first two columns of the frame with the L1 label 'Average'. I was hoping the following would work:

df.insert(0, 'Average', df.mean(level=1, axis=1))

Can you provide the expected output? I think I understand but I'm not sure what you mean by "add the mean to the first two columns" — user3483203, Sep 21 '18 at 21:44

user3483203 · Accepted Answer · 2018-09-21T21:51:56.150


IIUC, you just need to use groupby to calculate the mean, and then do a bit of work with the columns of the resulting DataFrame:

s = df.groupby(level=1, axis=1).mean()
s.columns = pd.MultiIndex.from_product([['Average'], s.columns])
pd.concat([s, df], axis=1)

    Average                   a                   b
        one       two       one       two       one       two
0  0.517939  0.713116  0.531990  0.578338  0.503889  0.847894
1  0.571197  0.676809  0.698986  0.425227  0.443409  0.928391
2  0.689653  0.399053  0.843179  0.069174  0.536126  0.728931
3  0.288367  0.197891  0.026974  0.026774  0.549761  0.369009
4  0.449904  0.590919  0.372560  0.556332  0.527247  0.625506
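For anyone on a recent pandas release, where `groupby(..., axis=1)` is deprecated, here is a minimal sketch of the same idea that groups on the transposed frame instead. The seeded random data and the names `rng`, `columns`, and `result` are my own placeholders, not part of the original answer; passing `names=` when rebuilding the columns is an optional extra that keeps the `L1`/`L2` labels in the output.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question (values are random).
rng = np.random.default_rng(0)
columns = pd.MultiIndex.from_product(
    [['a', 'b'], ['one', 'two']], names=['L1', 'L2']
)
df = pd.DataFrame(rng.random((5, 4)), columns=columns)

# Group the transposed frame by the L2 level, average, and transpose back.
# This sidesteps groupby(axis=1), which newer pandas versions deprecate.
s = df.T.groupby(level='L2').mean().T

# Put the means under a new top-level label, keeping the level names.
s.columns = pd.MultiIndex.from_product(
    [['Average'], s.columns], names=df.columns.names
)

# Place the summary columns in front of the original ones.
result = pd.concat([s, df], axis=1)
print(result)
```

The transpose round trip costs an extra copy, so on very wide or very large frames it may be worth timing against other approaches.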

edited Sep 21 '18 at 21:51

answered Sep 21 '18 at 21:43

user3483203


mmm... so you can't use `insert` here? I thought it might be faster than `concat` for some of the larger frames I'm dealing with. – rhaskett Sep 21 '18 at 23:58

Unfortunately, you can't use `insert` for more than a single column – user3483203 Sep 22 '18 at 00:15
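Since `insert` adds one column per call, one workaround is to call it in a loop with a tuple key for each new MultiIndex column. This is only a sketch under my own assumptions: the names `means`, `out`, `position`, and `label` are hypothetical, the data is seeded random data rather than the asker's, and I have not measured whether this is any faster than `concat`.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question (values are random).
rng = np.random.default_rng(0)
columns = pd.MultiIndex.from_product(
    [['a', 'b'], ['one', 'two']], names=['L1', 'L2']
)
df = pd.DataFrame(rng.random((5, 4)), columns=columns)

# Per-L2 means across the L1 groups; columns are 'one' and 'two'.
means = df.T.groupby(level='L2').mean().T

# Work on a copy so the original frame is left untouched.
out = df.copy()

# insert() takes a single column, so add them one at a time.
# Using the running position keeps 'one' before 'two' at the front.
for position, label in enumerate(means.columns):
    out.insert(position, ('Average', label), means[label])

print(out)
```

Note that `insert` modifies the frame in place, so drop the `copy()` only if mutating `df` is acceptable.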
