How to do Label encoding in Azure ML studio?

Question

I have around 80 columns in total, of which about 20 are categorical and need to be label encoded. I checked the solution provided here, which suggested using the Feature Hashing technique. But feature hashing produces output similar to one-hot encoding, not label encoding.

Example:

Column1
RL
RL
RM
RL
RM
RM

After feature hashing, the output turns out to be similar to one-hot encoding:

Column1-RL     Column1-RM
1              0
1              0
0              1
1              0
0              1
0              1

How can I do something like label encoding in Azure ML Studio, so that the output looks like this:

Column1
1
1
2
1
2
2

Sairam Tadepalli · Answer 1 · 2022-09-14T03:24:23.597

In ML Studio there are three ways to build a prediction model. The first is Notebooks, which work like Jupyter notebooks. The second is AutoML, which builds the prediction model automatically using predefined rules. The third is Designer, a tool in which each requirement is represented as a node, and nodes are connected to one another through their inputs and outputs.

A label encoder is not available as a dedicated option in AutoML or Designer. In Notebooks it can be done in code. In AutoML the encoding is performed internally by the model once the run starts after the dataset is uploaded. The generated labels are then visible in the AutoML dataset output for validation.

import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.read_csv("filename.csv")

df.head() # to get the top 5 rows of the dataset

df.dtypes # data type of every column

# we need to apply the label encoder to the object (string) column

df["Class"].unique() # the unique categories in the column

df["Class"].value_counts() # the count of each category

label_encoder = LabelEncoder() # create the label encoder object

# Apply the label encoder to the "Class" column and save the result in the original dataframe

df["target"] = label_encoder.fit_transform(df["Class"]) # integer codes stored in a new "target" column

To check whether the dataset is updated or not.

df.dtypes #use this method to get the updated dataset data types for each column(feature).

df['target'].unique() # we will get number for each category in that column

To know the count of each category

df['target'].value_counts() # will get total amount of count for each category
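Since the question involves about 20 categorical columns, the same idea can be applied to several columns in one pass. The following is only a minimal sketch, assuming pandas alone is enough for your case: it uses `pd.factorize` instead of `LabelEncoder`, and the helper name `label_encode_columns`, the `start` parameter, and the sample `Area` column are made up for illustration. Codes start at 1 to match the output asked for in the question.

```python
import pandas as pd


def label_encode_columns(df, columns, start=1):
    """Return a copy of df with `columns` replaced by integer codes, plus the mappings used.

    Hypothetical helper for illustration; not part of pandas or Azure ML.
    """
    encoded = df.copy()
    mappings = {}
    for col in columns:
        # sort=True assigns codes in sorted order of the categories;
        # missing values get the sentinel -1, so they end up as start - 1.
        codes, uniques = pd.factorize(encoded[col], sort=True)
        encoded[col] = codes + start
        mappings[col] = {value: i + start for i, value in enumerate(uniques)}
    return encoded, mappings


# Sample data modelled on the question; "Area" is an invented numeric column.
df = pd.DataFrame({
    "Column1": ["RL", "RL", "RM", "RL", "RM", "RM"],
    "Area": [8450, 9600, 11250, 9550, 14260, 14115],
})

encoded_df, mappings = label_encode_columns(df, columns=["Column1"])
print(encoded_df["Column1"].tolist())  # [1, 1, 2, 1, 2, 2]
print(mappings["Column1"])             # {'RL': 1, 'RM': 2}
```

Keeping the returned mappings lets you apply the same codes to new data later, or translate the numbers back to the original categories.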

To run this on the Azure platform, we need to create a resource, use the subscription key, and run the above code in a notebook.

In the case of AutoML, upload the dataset and run the model; the encoded result can be seen in the output after modelling.
