Pandas pivot_table preserve order

Question

>>> df
   A   B   C      D
0  foo one small  1
1  foo one large  2
2  foo one large  2
3  foo two small  3
4  foo two small  3
5  bar one large  4
6  bar one small  5
7  bar two small  6
8  bar two large  7
>>> table = pivot_table(df, values='D', index=['A', 'B'],
...                     columns=['C'], aggfunc=np.sum)
>>> table
          small  large
foo  one  1      4
     two  6      NaN
bar  one  5      4
     two  6      7

I want the output to be as shown above, but I get a sorted output. bar comes above foo and so on.

ayhan · Answer 1 · 2017-07-08T17:22:02.403

I think pivot_table doesn't have an option for sorting, but groupby has:

df.groupby(['A', 'B', 'C'], sort=False)['D'].sum().unstack('C')
Out: 
C        small  large
A   B                
foo one    1.0    4.0
    two    6.0    NaN
bar one    5.0    4.0
    two    6.0    7.0

You pass the grouping columns to groupby and for the ones you want to show as column values, you use unstack.

If you don't want the index names, rename them as None:

df.groupby(['A', 'B', 'C'], sort=False)['D'].sum().rename_axis([None, None, None]).unstack(level=2)
Out: 
         small  large
foo one    1.0    4.0
    two    6.0    NaN
bar one    5.0    4.0
    two    6.0    7.0

score 8 · Answer 2 · answered Nov 15 '21 at 15:57

Since pandas 1.3.0, it's possible to specify sort=False in pd.pivot_table:

>>> import pandas as pd
>>> df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
...                    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
...                    "C": ["small", "large", "large", "small","small", "large", "small", "small", "large"],
...                    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
...                    "E": [2, 4, 5, 5, 6, 6, 8, 9, 9]})
>>> pd.pivot_table(df, values='D', index=['A', 'B'], columns=['C'],
...                aggfunc='sum', sort=False)
C        large  small
A   B                
foo one    4.0    1.0
    two    NaN    6.0
bar one    4.0    5.0
    two    7.0    6.0

niraj · Accepted Answer · 2017-07-08T17:31:48.853

3

While creating pivot_table, the index is automatically sorted alphabetically. Not only foo and bar, you may also notice small and large is sorted. If you want foo on top, you may need to sort them again using sortlevel. If you are expecting output as in example here, then sorting on A and C both may be needed.

table.sortlevel(["A","B"], ascending= [False,True], sort_remaining=False, inplace=True)
table.sortlevel(["C"], axis=1, ascending=False,  sort_remaining=False, inplace=True)
print(table)

Output:

C        small  large
A   B                
foo one  1.0    4.0  
    two  6.0    NaN   
bar one  5.0    4.0  
    two  6.0    7.0

Update:

To remove index names A, B and C:

table.columns.name = None
table.index.names = (None, None)

edited Jul 08 '17 at 17:31

answered Jul 08 '17 at 17:12

niraj

17,498
4
33
48

How do I remove C A B from the above given solution?? small large foo one 1 4 two 6 NaN bar one 5 4 two 6 7 – Rahul Ranjan Jul 08 '17 at 17:19
For index there is multilevel i.e. you have `A` and `B` so you need `index.names`. You can take a look at https://stackoverflow.com/a/30254337/5916727 . I saw you mentioned you are beginner so, best thing would be to try., for e.g. what is returned by `table.index`, `table.cloumns.name` ...... – niraj Jul 08 '17 at 17:40
+1 for the (obvious, in hindsight) suggestion about how to remove the index labels (if you get a NaN using None, simply use an empty string) – RobM Oct 11 '19 at 07:35

Pandas pivot_table preserve order

3 Answers3

Update:

Linked

Related