
If a dataset contains character (i.e. categorical) data, do we need to convert it into numerical data using one-hot encoding?

My second question: is one-hot encoding only meaningful for nominal data, or is it meaningful for both nominal and ordinal data?

Justyna MK

1 Answer


Categorical variables do need to be converted into numerical form before being passed to a model (although some model implementations do this automatically). One-hot encoding is one way to do it, but there are many other encoders you could choose (ordinal encoding, binary encoding, hashing encoding, ...), each suited to a different situation.
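To make the idea concrete, here is a minimal sketch of one-hot encoding in plain Python. The helper name `one_hot_encode` and the sample values are made up for illustration; in practice you would more likely use an encoder from a library rather than write your own.

```python
def one_hot_encode(values):
    """Map each categorical value to a 0/1 vector containing a single 1.

    Hypothetical helper for illustration only.
    """
    # Sort the distinct categories so the column order is reproducible.
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}

    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # switch on the column for this category
        encoded.append(row)
    return categories, encoded


# Illustrative data, not taken from any real dataset.
categories, encoded = one_hot_encode(["car", "bus", "metro", "bike", "bus"])
```

Each row has exactly one column set to 1, and the number of columns equals the number of distinct categories, which is why one-hot encoding can get wide for features with many categories.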

For the second question, one-hot encoding can be applied whether your data is nominal or ordinal; the only thing that really matters is that your data is categorical.

That said, if your data is ordinal-encoded, the model will accept it. But an ordinal encoding can be harmful in some situations, because it introduces a notion of distance between your categories. For example, suppose you have this encoding for means of transportation:

  • 1 -> car
  • 2 -> bus
  • 3 -> metro
  • 4 -> bike

The model would understand that bike is closer to metro than to car, which is information you may not want to give to your model. One-hot encoding solves this issue by placing all categories at the same distance from one another.
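The distance argument can be checked with a short sketch. It assumes Euclidean distance as the measure of "closeness", which is only one possible choice (a given model may use the numbers differently), and the helper names are hypothetical.

```python
import math

# Ordinal encoding from the example above.
ordinal = {"car": 1, "bus": 2, "metro": 3, "bike": 4}

# One-hot encoding: one column per category, a single 1 per vector.
categories = list(ordinal)
one_hot = {
    cat: [1 if i == j else 0 for j in range(len(categories))]
    for i, cat in enumerate(categories)
}


def ordinal_distance(a, b):
    """Distance between two categories under the ordinal encoding."""
    return abs(ordinal[a] - ordinal[b])


def one_hot_distance(a, b):
    """Euclidean distance between two categories under one-hot encoding."""
    return math.dist(one_hot[a], one_hot[b])


# Ordinal: bike looks closer to metro than to car.
bike_metro = ordinal_distance("bike", "metro")
bike_car = ordinal_distance("bike", "car")

# One-hot: every pair of distinct categories is equally far apart.
pairwise = {
    (a, b): one_hot_distance(a, b)
    for a in categories
    for b in categories
    if a != b
}
```

Under the ordinal encoding the bike-metro distance is 1 and the bike-car distance is 3, while every one-hot pair is sqrt(2) apart, so no ordering between categories is implied.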