Pandas pivot multi-index table with no aggregation

Question

I have a pandas dataframe with a multi-index where several groupby aggregations have already been computed. I'm trying to pivot the final grouped column into headings, without any aggregation. I've managed to accomplish it with aggregation, but can't seem to figure out how to simply transpose it so the 2nd last column becomes headers and the last column becomes the values under each header.

A similar example:

df = pd.DataFrame.from_dict(
   {'Classifier':['SVM']*6 + ['CNN']*6,
    'Preprocess': (['None']*3 + ['PCA']*3)*2,
    'Group': ([1]*2 + [2]*2)*3,
    'Fold': ([1] + [2])*6,
    'Accuracy': np.random.rand((12))
    })
df = df.groupby(['Classifier', 'Preprocess', 'Group']).agg({'Accuracy': 'mean'})
print df
                             Accuracy
Classifier Preprocess Group          
CNN        None       1      0.912794
                      2      0.514453
           PCA        1      0.694035
                      2      0.740769
SVM        None       1      0.235105
                      2      0.817883
           PCA        1      0.567313
                      2      0.657962

I'm trying to get it to look something like:

                      Accuracy - Group
Classifier Preprocess 1         2        
CNN        None       0.912794  0.514453              
           PCA        0.694035  0.740769
SVM        None       0.235105  0.817883
           PCA        0.567313  0.657962

I should note that the 'Accuracy' column actually ends up being a string because of extra calculations. Any ideas? I feel like I've tried every combination of weird pivot/stack/unstack/re-index, but the closest I can get is a very short table where everything is aggregated down oddly when I want to actually maintain the final column numbers (no aggregation).

@peur - so need `df.unstack(-1)` only? No necessary `rename` ? — jezrael, Jan 11 '18 at 06:59
@jezrael I was mostly confused about the `unstack`, but the `rename` is very helpful as well — peur, Jan 11 '18 at 07:09
@peur - I ask it because from your question I understand you need also `rename`. So then it is not dupe. But if need only `unstack`, please unaccept and delete your question, because dupe. — jezrael, Jan 11 '18 at 07:11
@jezrael I'm relatively new to stack overflow, so I'm unsure on duplicate etiquette, but these meta post - [relevant meta-post](https://meta.stackoverflow.com/questions/265736/should-i-delete-my-question-if-it-is-marked-as-a-duplicate), [another](https://meta.stackexchange.com/questions/10841/how-should-duplicate-questions-be-handled) - seems to suggest I should leave it. Or should I unaccept your answer and click 'that solved my question' on the duplicate marker? — peur, Jan 11 '18 at 07:23
@peur - Hmmm, you cannot mark as accepted to another solution, which is not under your question. You can only upvote, if get reputation 15+. But if need only `unstack`, what is one function answer, then is clear dupe and no reason for leave it. — jezrael, Jan 11 '18 at 07:31
@peur - but if it is combination of multiple functions is possible it is not dupe, e.g. my solution is not dupe if need `unstack` and also `rename` first level of `MultiIndex` in columns. — jezrael, Jan 11 '18 at 07:34

jezrael · Accepted Answer · 2018-01-11T06:58:12.087

I think you need unstack + rename:

print (df.unstack().rename(columns=lambda x: x + ' - '+ df.index.names[-1], level=0))
                      Accuracy - Group          
Group                                1         2
Classifier Preprocess                           
CNN        None               0.879696  0.644665
           PCA                0.613153  0.502952
SVM        None               0.589627  0.323793
           PCA                0.619407  0.356250

Alternative:

df1 = df.unstack().rename(columns=lambda x: '{} - {}'.format(x, df.index.names[-1]), level=0)
print (df1)
                      Accuracy - Group          
Group                                1         2
Classifier Preprocess                           
CNN        None               0.376257  0.531367
           PCA                0.058679  0.064388
SVM        None               0.630307  0.908984
           PCA                0.312942  0.895200

Detail:

print (df.index.names)
['Classifier', 'Preprocess', 'Group']

print (df.index.names[-1])
Group

print (df.unstack())
                       Accuracy          
Group                         1         2
Classifier Preprocess                    
CNN        None        0.376257  0.531367
           PCA         0.058679  0.064388
SVM        None        0.630307  0.908984
           PCA         0.312942  0.895200

Pandas pivot multi-index table with no aggregation

1 Answers1