I would like to reduce the number of categories in one variable. The code below works:

df.loc[(df['category'] == 'cat1')|(df['category'] == 'cat2')|(df['category'] == 'cat3')|...|(df['category'] == 'catn'),'category'] = 'other'

but I was wondering if I could do something like:

category_to_change = ['cat1','cat2','cat3',...,'catn']

for name in category_to_change:
    df.loc[(df['category'] == name),'category'] == 'other'

(this doesn't work)

Any ideas how to do this?

tripleee
Pierre

1 Answer

When asking a question, it helps to include the code that creates the dataframe, so that suggestions can be tested. This code should work:

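import pandas as pd

df = pd.DataFrame({'category': ['cat', 'dog', 'cat', 'rat']})
# Passing a list to replace() maps every listed value to the single value 'other'
df['category'] = df['category'].replace(['cat', 'dog'], 'other')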

All occurrences of cat or dog are replaced by other.
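
If you would rather keep the `.loc` style from the question, a boolean mask built with `Series.isin` is one alternative. Note that the loop in the question ends with `== 'other'`, which only compares and discards the result; assignment needs a single `=`. The sketch below uses made-up sample data and a shortened `category_to_change` list, both of which are assumptions for illustration only:

```python
import pandas as pd

# Hypothetical sample data; the real dataframe was not shown in the question
df = pd.DataFrame({'category': ['cat1', 'cat2', 'cat3', 'keep']})

# Illustrative list of categories to collapse (not the asker's full list)
category_to_change = ['cat1', 'cat2']

# True for every row whose category is in the list
mask = df['category'].isin(category_to_change)

# Assign with a single '=' so the matching rows are actually overwritten
df.loc[mask, 'category'] = 'other'

result = df['category'].tolist()
print(result)
```

Rows whose category is not in the list are left untouched, so no loop is needed.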

arhr