
How can I get all unique values and their frequencies in each dataframe column using groupby?

I have tried `stats.mode`, which gives the most frequent value per column; I would like to expand this with the frequencies of the other column values.

import pandas as pd
from scipy import stats

source = pd.DataFrame({'col1': ['A', 'A', 'A', 'A'],
                       'col2': ['a', 'a', 'a', 'b'],
                       'col3': ['c', 'c', 'd', 'd']})
source.groupby(['col1'])[['col2', 'col3']].agg(stats.mode)

result:

            col2        col3
col1
A     ([a], [3])  ([c], [2])

I would like to get:

                  col2              col3
col1
A     ([a, b], [3, 1])  ([c, d], [2, 2])

2 Answers


You need a custom function:

def cust_mode(s):
    # count occurrences of each unique value, most frequent first
    counts = s.value_counts()
    # return the unique values and their matching frequencies
    return list(counts.index), list(counts)

out = source.groupby(['col1'])[['col2', 'col3']].agg(cust_mode)

output:

                  col2              col3
col1                                    
A     ([a, b], [3, 1])  ([c, d], [2, 2])
  • Note that OP wants tuple instead of list data types – ko3 Jul 12 '22 at 13:59
  • Cheers, as always a smooth solution. Note: with just `return list(counts.index), list(counts)`, it is the values in the data frame, not the list data types, that must be converted to tuples – ko3 Jul 12 '22 at 14:07
  • @ko3 damn I need sleep, but OP can figure this out easily ;) – mozway Jul 12 '22 at 14:15
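Following up on the comments, here is a minimal sketch (assuming the same `source` frame as in the question) of the variant that returns tuples instead of lists:

```python
import pandas as pd

source = pd.DataFrame({'col1': ['A', 'A', 'A', 'A'],
                       'col2': ['a', 'a', 'a', 'b'],
                       'col3': ['c', 'c', 'd', 'd']})

def cust_mode(s):
    # count occurrences of each unique value, most frequent first
    counts = s.value_counts()
    # convert both the values and the frequencies to tuples
    return tuple(counts.index), tuple(counts)

out = source.groupby(['col1'])[['col2', 'col3']].agg(cust_mode)
print(out)
```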
# df is your dataframe
for k in df.columns:
    print(df[k].value_counts())

Gives you what you want. If you want to include NaN, pass `dropna=False` to `value_counts`.
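As a quick illustration of the `dropna` behavior, using a hypothetical frame with one missing value:

```python
import pandas as pd

df = pd.DataFrame({'col': ['a', 'a', None, 'b']})

# default: the missing value is dropped from the counts
print(df['col'].value_counts())

# dropna=False keeps NaN as its own category in the result
print(df['col'].value_counts(dropna=False))
```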
