I am working on a regression problem. I have a categorical column with 24 distinct values, and one-hot encoding it produces too many dummy variables. Is there a way to avoid the dummy variable trap? Kindly guide me. Here is a sample of the categorical column:

[image: sample of the categorical column]

After label encoding:

[image: the column after label encoding]

Thank you

Kalyan

1 Answer

You can label-encode the column by converting it to the pandas category dtype and taking its integer codes:

df['column'] = df['column'].astype('category').cat.codes

Example:

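import pandas as pd
df = pd.DataFrame(['a','b','c','d','a','c','a','d'], columns=['column'])
df['column'] = df['column'].astype('category').cat.codes  # replace each label with its integer code
print(df)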

Output:

   column
0       0
1       1
2       2
3       3
4       0
5       2
6       0
7       3
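
As a rough sketch of what those codes mean, and of one common alternative, here is the same toy column again. The names `cat`, `codes`, `mapping` and `dummies` are illustrative only and not part of the original answer. The codes follow the sorted order of the categories, so a linear regression will treat them as a single ordered numeric feature. If that ordering is not meaningful for your data, `pd.get_dummies` with `drop_first=True` keeps k - 1 indicator columns for k categories, which is the usual way to avoid the dummy variable trap.

```python
import pandas as pd

df = pd.DataFrame(['a', 'b', 'c', 'd', 'a', 'c', 'a', 'd'], columns=['column'])

# Keep the categorical around so the code -> label mapping can be recovered later.
cat = df['column'].astype('category')
codes = cat.cat.codes
mapping = dict(enumerate(cat.cat.categories))

# Alternative (an assumption about what the regression needs, not the answer's method):
# one-hot encode but drop the first level, so 4 categories give 3 dummy columns.
dummies = pd.get_dummies(df['column'], prefix='column', drop_first=True)

print(mapping)                # {0: 'a', 1: 'b', 2: 'c', 3: 'd'}
print(list(dummies.columns))  # ['column_b', 'column_c', 'column_d']
```

With the 24-value column from the question, the second approach would give 23 dummy columns rather than 24; the first gives a single integer column.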
Joe