
Given a sample MultiIndex:

import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_product([[0, 1, 2], ['a', 'b', 'c', 'd']])
df = pd.DataFrame({'value': np.arange(12)}, index=idx)

df

     value
0 a      0
  b      1
  c      2
  d      3
1 a      4
  b      5
  c      6
  d      7
2 a      8
  b      9
  c     10
  d     11

How can I efficiently convert this to a tabular format like so?

   a  b   c   d
0  0  1   2   3
1  4  5   6   7
2  8  9  10  11

Furthermore, given the dataframe above, how can I bring it back to its original multi-indexed state?


What I've tried:

pd.DataFrame(df.values.reshape(-1, df.index.levels[1].size), 
             index=df.index.levels[0], columns=df.index.levels[1])

This works for the first problem, but I'm not sure how to get back to the original multi-indexed form from there.

cs95

3 Answers

Using unstack and stack

In [5359]: dff = df['value'].unstack()

In [5360]: dff
Out[5360]:
   a  b   c   d
0  0  1   2   3
1  4  5   6   7
2  8  9  10  11

In [5361]: dff.stack().to_frame('name')
Out[5361]:
     name
0 a     0
  b     1
  c     2
  d     3
1 a     4
  b     5
  c     6
  d     7
2 a     8
  b     9
  c    10
  d    11
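
As a quick check on the round trip, here is a minimal sketch that rebuilds the question's frame, unstacks it, stacks it back, and compares the result with the original. The names `wide` and `restored` are illustrative only, and the column is named 'value' (rather than 'name' as above) on the assumption that you want the restored frame to match the original exactly.

```python
import numpy as np
import pandas as pd

# Sample frame from the question.
idx = pd.MultiIndex.from_product([[0, 1, 2], ['a', 'b', 'c', 'd']])
df = pd.DataFrame({'value': np.arange(12)}, index=idx)

# Long -> wide: the inner index level becomes the columns.
wide = df['value'].unstack()

# Wide -> long: stack() returns a Series, so name the column on the way back.
restored = wide.stack().to_frame('value')

# Compare values, index and dtypes with the original frame.
print(restored.equals(df))
```

Note that this only round-trips cleanly because every (outer, inner) pair is present; with missing combinations, unstack would introduce NaN and the dtype would change.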
Zero

By using get_level_values

pd.crosstab(df.index.get_level_values(0), df.index.get_level_values(1), values=df.value, aggfunc='sum')
Out[477]: 
col_0  a  b   c   d
row_0              
0      0  1   2   3
1      4  5   6   7
2      8  9  10  11
BENY

Another alternative, which you should think of when using stack/unstack (though unstack is clearly better in this case!) is pivot_table:

In [11]: df.pivot_table(values="value", index=df.index.get_level_values(0), columns=df.index.get_level_values(1))
Out[11]:
   a  b   c   d
0  0  1   2   3
1  4  5   6   7
2  8  9  10  11
Andy Hayden
  • Just curious, any benefit of `get_level_values` over `levels[i]`? – cs95 Oct 24 '17 at 03:37
  • @cᴏʟᴅsᴘᴇᴇᴅ I don't think so; the only thing is that sometimes it's nicer to reference a level by name, which get_level_values allows (so I'm used to writing it). – Andy Hayden Oct 24 '17 at 03:38
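
To make the comparison in the comments concrete, here is a small sketch of how the two accessors behave. The level names 'num' and 'letter' are hypothetical (the original index is unnamed) and are added only so that a level can be referenced by name. One difference worth keeping in mind: `levels[i]` holds the distinct labels of a level, while `get_level_values` returns one label per row, so the two are not interchangeable in calls like pivot_table or crosstab.

```python
import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_product(
    [[0, 1, 2], ['a', 'b', 'c', 'd']],
    names=['num', 'letter'],  # hypothetical names, not in the original post
)
df = pd.DataFrame({'value': np.arange(12)}, index=idx)

# levels[i]: the distinct labels stored for the level.
unique_labels = df.index.levels[1]

# get_level_values: one label per row, addressable by position or by name.
per_row_labels = df.index.get_level_values('letter')

print(len(unique_labels), len(per_row_labels))

# After slicing, levels can still list labels that no longer occur in the
# data, whereas get_level_values reflects only the rows that remain.
subset = df.iloc[:4]
stale_levels = subset.index.levels[0]
present_values = subset.index.get_level_values('num').unique()
print(list(stale_levels), list(present_values))
```

If the stale labels matter, `subset.index.remove_unused_levels()` returns an index whose levels are trimmed to what is actually present.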