specific values count in groupby

Question

I have the following dataframe:

df = pd.DataFrame([
    ('A', 'A', 'Cat'),
    ('A', 'A', 'Dog'),
    ('A', 'A', 'Cat'),
    ('A', 'B', 'Dog'),
    ('B', 'B', 'Rat'),
    ('B', 'C', 'Cat'),
    ('B', 'C', 'Cat')
], columns=['id', 'group', 'Animal'])

I want to group it by id and group and calculate the occurrence of Cat in each group. An example output will be:

[2, 0, 0, 2]

2 cat in group AA,

0 cat in group AB,

0 cat in group BB,

2 cat in group BC

Can anyone help? Thanks!

Does this answer your question? [How to do a conditional count after groupby on a Pandas Dataframe?](https://stackoverflow.com/questions/45752601/how-to-do-a-conditional-count-after-groupby-on-a-pandas-dataframe) — Reslan Tinawi, May 05 '21 at 07:20

cs95 · Accepted Answer · 2020-03-05T04:10:52.560

8

You'll need mask + groupby:

df['Animal'] = df['Animal'].mask(df['Animal'].ne('Cat'))
df.groupby(['id', 'group'])['Animal'].count().tolist()

[2, 0, 0, 2]

edited Mar 05 '20 at 04:10

answered Apr 09 '18 at 03:55

cs95

379,657
97
704
746

Note: `ne` is "not equal" – scottlittle Mar 04 '20 at 21:10

score 6 · Answer 2 · answered Apr 09 '18 at 03:58

You can using value_counts

df.groupby(['id','group']).Animal.value_counts().unstack(fill_value=0).loc[:,'Cat'].tolist()
Out[144]: [2, 0, 0, 2]

More info return before the last step .loc

df.groupby(['id','group']).Animal.value_counts().unstack(fill_value=0)
Out[145]: 
Animal    Cat  Dog  Rat
id group               
A  A        2    1    0
   B        0    1    0
B  B        0    0    1
   C        2    0    0

score 4 · Answer 3 · answered May 05 '21 at 07:34

Your question is very similar (if not duplicate) to this question.

When you want to perform some aggregation which are not part of pandas (e.g. sum, max, min) then you should use the apply method.

Please note that apply can be slower, so don't start using everywhere.

Now, for your question, you start with grouping by:

df.groupby(by=["id", "group"])

Then, you select the Animal column from the groups:

df.groupby(by=["id", "group"])['Animal']

At this point, you have values from the Animal column for each different group, ready for any aggregation.

Let's define a function to count Cats in a pandas Series:

def count_cat(series):
    return (series == 'Cat').sum()

All you have to do now, is to pass this function to the apply method, like the following:

df.groupby(by=["id", "group"])['Animal'].apply(count_cat).reset_index(name='count_cat')

And the result would be:

	id	group	count_cat
0	A	A	2
1	A	B	0
2	B	B	0
3	B	C	2

score 0 · Answer 4 · answered Apr 08 '20 at 18:12

0

if 0 values are not important than you can try:

list(df[df['Animal']=='Cat'].groupby(['id', 'group']).count()['Animal'])

Sorry for my English

answered Apr 08 '20 at 18:12

David Saghinadze

1
1

specific values count in groupby

4 Answers4