
This is what my CSV looks like:

name, cuisine, review
A, Chinese, this
A, Indian, is
B, Indian, an
B, Indian, example
B, French, thank
C, French, you

I'm trying to count how many different names each cuisine appears under. This is what I should be getting:

Cuisine, Count
Chinese, 1
Indian, 2
French, 2

But as you can see, there are duplicates within a name (e.g. B has Indian twice), so I tried to use drop_duplicates, but it doesn't work. I used

df.groupby('name')['cuisine'].drop_duplicates() 

and it fails with an error saying the SeriesGroupBy object cannot do that.

Somehow I need to apply value_counts() to get the number of occurrences of each cuisine, but the duplicates are getting in the way. Any idea how I can do this in pandas? Thanks.

wayneloo

2 Answers


You're looking for groupby and nunique:

df.groupby('cuisine', sort=False).name.nunique().to_frame('count')

         count
cuisine       
Chinese      1
Indian       2
French       2

This returns the number of unique names per cuisine.
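For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question. The `io.StringIO` wrapper and `skipinitialspace=True` are my assumptions (the CSV in the question has a space after each comma, so without it the column would be named `' cuisine'`). It also shows a `drop_duplicates` plus `value_counts` route, which is closer to what the question was attempting; that variant is my addition, not part of the answer above.

```python
import io

import pandas as pd

# Sample data copied from the question; the in-memory string is only for illustration.
csv_text = """name, cuisine, review
A, Chinese, this
A, Indian, is
B, Indian, an
B, Indian, example
B, French, thank
C, French, you
"""

# skipinitialspace=True strips the blank after each comma (assumption about the file layout).
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

# Number of distinct names per cuisine, kept in order of first appearance.
counts = df.groupby('cuisine', sort=False)['name'].nunique()

# Alternative: drop repeated (name, cuisine) pairs first, then count cuisines.
alt = df.drop_duplicates(['name', 'cuisine'])['cuisine'].value_counts()

print(counts)
print(alt)
```

Both give the same counts; `value_counts` sorts by frequency, so the row order may differ from the `groupby` result.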

cs95

Using crosstab

pd.crosstab(df.name,df.cuisine).ne(0).sum()
Out[550]: 
cuisine
 Chinese    1
 French     2
 Indian     2
dtype: int64
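To see why this works, it may help to look at the intermediate steps. The sketch below builds the frame by hand from the question's data (the variable names `table`, `present`, and `result` are just illustrative) and splits the one-liner into its three stages.

```python
import pandas as pd

# Frame rebuilt by hand from the question's sample data (review column omitted).
df = pd.DataFrame({
    'name': ['A', 'A', 'B', 'B', 'B', 'C'],
    'cuisine': ['Chinese', 'Indian', 'Indian', 'Indian', 'French', 'French'],
})

# Step 1: rows are names, columns are cuisines, cells are raw row counts.
# B / Indian is 2 here because of the duplicate.
table = pd.crosstab(df['name'], df['cuisine'])

# Step 2: turn counts into True/False ("does this name have this cuisine at all?").
present = table.ne(0)

# Step 3: summing booleans down each column counts the names per cuisine.
result = present.sum()

print(table)
print(result)
```

The `ne(0)` step is what removes the effect of duplicates: a name contributes at most 1 to each cuisine no matter how many rows it has.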
BENY