0

I have a questionnaire designed like this:

What is(are) your favorite food? 
1: Apple
2: Banana
3: Pear
4: Other ____

The result is like this: 
ID      Options chosen
1        Apple, Banana
2        Apple
3        Apple, Banana, Pear, Orange
4        Orange

So the count should be like this

Apple: 3
Banana: 2
Pear: 1
Other: 2

I tried using pandas, but still couldn't figure out the way to create the table:

options = ['Apple', 'Banana', 'Pear']
df['options'].isin(options)

But the result turns out to be all False, let alone the count for Other option. Can anyone help?

Sandy
  • 359
  • 4
  • 14
  • `df.options = df.options.str.split(', ')`, `df = df.explode('options').reset_index(drop=True)` and then `df.options.value_counts()` – Trenton McKinney Jun 07 '22 at 17:59

1 Answers1

0

You can use np.select() in combination with explode to get get the count of the items to specify and put those that aren't in the given list in the "other" category

df['Options chosen'] = df['Options chosen'].apply(lambda x : x.split(', '))
df = df.explode('Options chosen')
condition_list = [df['Options chosen'].isin(['Apple', 'Banana', 'Pear'])]
choice_list = [df['Options chosen']]
df['Options chosen'] = np.select(condition_list, choice_list, 'Other')
df = df.groupby('Options chosen').count().reset_index()
df
ArchAngelPwn
  • 2,891
  • 1
  • 4
  • 17