2

I am very close to go insane. I have a dataframe like this:

subject  sessionIndex  screenIndex  index      key    time
   s019             1            3      1    Shift  0.3442
   s019             1            3      2  Shift.t  0.1514
   s019             1            3      3        h  0.0844
   s019             1            3      4        e  0.1127
   s019             1            3      5    space  0.1201
   s091             3            5    821        h  0.1126
   s091             3            5    822        a  0.1425
   s091             3            5    823        n  0.0926
   s091             3            5    824        d  0.1525

after using:

pivot_table(data,values='time', rows=['subject','sessionIndex','screenIndex','index'], cols=['key'])

I have the following dataframe:

key                                      Shift  Shift.t       a       d   ...
subject sessionIndex screenIndex index                                    
s019    1            3           1      0.3442      NaN     NaN     NaN   ...
                                 2         NaN   0.1514     NaN     NaN   ...
                                 3         NaN      NaN     NaN     NaN   ...
                                 4         NaN      NaN     NaN     NaN   ...
                                 5         NaN      NaN     NaN     NaN   ...
s091    3            5           821       NaN      NaN     NaN     NaN   ...
                                 822       NaN      NaN  0.1425     NaN   ...
                                 823       NaN      NaN     NaN     NaN   ...
                                 824       NaN      NaN     NaN  0.1525   ...

That is great but I got stuck to "unfold" the multiindex so that my dataframe looks like this:

subject sessionIndex screenIndex index   Shift  Shift.t       a       d   ...                               
s019    1            3           1      0.3442      NaN     NaN     NaN   ...
s019    1            3           2         NaN   0.1514     NaN     NaN   ...
s019    1            3           3         NaN      NaN     NaN     NaN   ...
s019    1            3           4         NaN      NaN     NaN     NaN   ...
s019    1            3           5         NaN      NaN     NaN     NaN   ...
s091    3            5           821       NaN      NaN     NaN     NaN   ...
s091    3            5           822       NaN      NaN  0.1425     NaN   ...
s091    3            5           823       NaN      NaN     NaN     NaN   ...
s091    3            5           824       NaN      NaN     NaN  0.1525   ...

I already tried reindexing and index reset. I am fairly new to pandas and python so maybe I am just to stupid. Please let me know what I am missing here, Thx.

  • I'm not sure what you mean by "unfold". Although repeating indicies are hidden for display purposes, they are still there. But perhaps you wish the four MultiIndex level were normal columns -- in which case they would display as you specified and would in some ways be easier to handle. In that case, just use the result of ``df.reset_index()``. – Dan Allan Aug 12 '13 at 19:32
  • Hi Dan, thx for your reply. I already tried this. The issue is that this command also reverts all the variables back into cases. So my data is as before I called pivot_table. – user2675976 Aug 12 '13 at 21:27
  • Hmm. That's surprising. Are you sure you are executing reset_index on the result of pivot_table, and not on the original DataFrame? – Dan Allan Aug 12 '13 at 22:27
  • I am going nuts. It looks you are right. I added a temporary variable to store the results of pivot_table and now it seem to work! Big THX would never figure this out alone! – user2675976 Aug 12 '13 at 22:33

1 Answers1

0

This is simply a display option in pandas.

Try adding this line immediately after importing pandas: pandas.set_printoptions(multi_sparse=False)

Paul H
  • 65,268
  • 20
  • 159
  • 136
  • Hi Paul, thx for your reply. My problem is actually not the formatting. I want to use the columns to execute some selectors. This is much harder when the columns are joined in an multiindex. Therfore I want to restore the index columns. – user2675976 Aug 12 '13 at 21:32
  • @user2675976 It's better to keep the index, IMO. The `dataframe.select` method can very powerfully execute selections on multiply values of a `MultiIndex` (e.g., `subset = df.select(lambda x: x[0] == 'S019' and s[2] in [3,5] and s[3] <= 5)` – Paul H Aug 13 '13 at 13:56
  • Thx Paul. I solved my issue. As I wanted to use different selection criterias afterwards: train = data[data['subject']==user] train = train[train['sessionIndex']==1] it seems easier to restore the inices. – user2675976 Aug 14 '13 at 13:18