1

Sample Dataset (Note that each combination of Col_A and Col_B is unique):

import pandas as pd

d = {'Col_A': [1,2,3,4,5,6,9,9,10,11,11,12,12,12,12,12,12,13,13], 
     'Col_B': ['A','K','E','E','H','A','J','A','L','A','B','A','J','C','D','E','A','J','L'],
     'Value':[180,120,35,654,789,34,567,21,235,83,234,648,654,234,873,248,45,67,94]
    }

df = pd.DataFrame(data=d)

Sample Data

The requirement is to generate a table with each Col_B's amount, Col_A's counts, and total amount per Col_A. Show the categories in Col_B in descending order by their total amount.

This is what I have so far:

df.groupby(['Col_B','Col_A']).agg(['count','sum'])

The output would look like this. However, I'd like to add subtotals for each Col_B category and rank those subtotals of the categories in descending order so that it fulfills the requirement of getting each Col_B's amount.

enter image description here

Thanks in advance, everyone!

Jiamei
  • 405
  • 3
  • 14
  • kindly add your expected output dataframe – sammywemmy Dec 05 '21 at 08:24
  • @sammywemmy Thanks for your kind reminder. Since there's already an answer with my desired output, I will not make more edits to this post but will make sure to add one next time when I have a question :) – Jiamei Dec 05 '21 at 18:22

1 Answers1

1

The expected result is not clear for me but is it what your are looking for?

piv = df.groupby(['Col_B', 'Col_A'])['Amount'].agg(['count','sum']).reset_index()
tot = piv.groupby('Col_B', as_index=False).sum().assign(Col_A='Total')
cat = pd.CategoricalDtype(tot.sort_values('sum')['Col_B'], ordered=True)

out = pd.concat([piv, tot]).astype({'Col_B': cat}) \
        .sort_values('Col_B', ascending=False, kind='mergesort') \
        .set_index(['Col_B', 'Col_A'])
>>> out
             count   sum
Col_B Col_A             
J     9          1   567
      12         1   654
      13         1    67
      Total      3  1288
A     1          1   180
      6          1    34
      9          1    21
      11         1    83
      12         2   693
      Total      6  1011
E     3          1    35
      4          1   654
      12         1   248
      Total      3   937
D     12         1   873
      Total      1   873
H     5          1   789
      Total      1   789
L     10         1   235
      13         1    94
      Total      2   329
C     12         1   234
      Total      1   234
B     11         1   234
      Total      1   234
K     2          1   120
      Total      1   120
Corralien
  • 109,409
  • 8
  • 28
  • 52
  • Hi, @Corralien! Thanks for your answer. One more question. Suppose Col_A is IDNumbers. Is there a way to show the count subtotal as unique ID counts? For example, instead of 6 for Category A Total, it would return 5, since ID 12 has occurred twice. – Jiamei Dec 06 '21 at 00:14
  • Replace `tot = ...` by `tot = piv.groupby('Col_B', as_index=False).agg({'count': 'size', 'sum': 'sum'}).assign(Col_A='Total')`. Is it what you expect? – Corralien Dec 06 '21 at 07:07