Please help me be more pythonic:

I am label encoding all categorical features with Pandas. I know this can also be done with Sklearn but I'd like to do it with Pandas or Python alone.

I did this by first selecting all columns of dtype 'object', which all happen to be categorical (I am dealing with a small dataframe, so I know this for sure). Then I used a for loop to convert each column.

I'm sure this can be done without the for loop. The more pythonic the better:

cat_cols = df.select_dtypes(include='object').columns

for col in cat_cols:
    df[col] = df[col].astype('category').cat.codes
Odisseo

1 Answer

According to this link, a for loop is not always 'bad'. If you do need to get rid of it, you can use apply:

cat_cols = df.select_dtypes(include='object').columns
df[cat_cols] = df[cat_cols].apply(lambda x: x.astype('category').cat.codes)
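
For anyone who wants to check this, here is a minimal sketch on a made-up DataFrame (the column names and values are assumptions, not from the question). It runs the loop and a column-wise apply side by side. Note that apply's default axis=0 hands the lambda one column at a time, which is what per-column label encoding needs; axis=1 would hand it one row at a time and encode each row separately.

```python
import pandas as pd

# Hypothetical example data; column names and values are made up for illustration.
df = pd.DataFrame({
    "color": pd.Series(["red", "blue", "red", "green"], dtype=object),
    "size": pd.Series(["S", "M", "L", "M"], dtype=object),
    "price": [1.0, 2.5, 3.0, 4.2],
})

cat_cols = df.select_dtypes(include="object").columns

# Loop version from the question, run on a copy so the two can be compared.
looped = df.copy()
for col in cat_cols:
    looped[col] = looped[col].astype("category").cat.codes

# apply version: with the default axis=0 the lambda receives one column at a time.
applied = df.copy()
applied[cat_cols] = applied[cat_cols].apply(
    lambda s: s.astype("category").cat.codes
)

# Both versions should produce the same codes for every categorical column.
same = (looped[cat_cols].astype(int) == applied[cat_cols].astype(int)).all().all()
print(same)
```

The codes follow the sorted order of the labels in each column, so they are only meaningful within a column, not across columns.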
BENY
  • Well but based on the above, the second line of code only gets applied to one column. How would I get it to apply to all columns in the cat_cols list? – Odisseo Jan 22 '19 at 01:44
  • @Odisseo sorry, change col to cat_cols – BENY Jan 22 '19 at 01:46
  • Just for the record, the solution above actually takes longer than the original with the for loop – Odisseo Jan 22 '19 at 05:05
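
A rough way to check the timing claim yourself is sketched below with timeit. The data, the helper names encode_loop and encode_apply, and the repeat count are all assumptions for illustration; the numbers will vary with pandas version, data size, and machine, so treat any result as indicative only.

```python
import timeit

import pandas as pd

# Hypothetical data: 5 object columns with 10,000 repeating labels each.
labels = ["a", "b", "c", "d"]
values = [labels[i % len(labels)] for i in range(10_000)]
df = pd.DataFrame(
    {f"c{i}": pd.Series(values, dtype=object) for i in range(5)}
)
cat_cols = df.select_dtypes(include="object").columns


def encode_loop(frame):
    """Label-encode each categorical column with an explicit for loop."""
    out = frame.copy()
    for col in cat_cols:
        out[col] = out[col].astype("category").cat.codes
    return out


def encode_apply(frame):
    """Label-encode the same columns with a column-wise apply."""
    out = frame.copy()
    out[cat_cols] = out[cat_cols].apply(
        lambda s: s.astype("category").cat.codes
    )
    return out


# Time 20 runs of each approach on the same input.
loop_time = timeit.timeit(lambda: encode_loop(df), number=20)
apply_time = timeit.timeit(lambda: encode_apply(df), number=20)
print(f"loop:  {loop_time:.4f}s")
print(f"apply: {apply_time:.4f}s")
```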