Label encoding converts the categorical labels in a machine learning data set into numeric form, so that algorithms that require numeric input can operate on them. It is an important pre-processing step for structured data sets in supervised learning.
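As a minimal sketch of the idea described above, the snippet below uses scikit-learn's LabelEncoder on a small list of labels. The sample labels are hypothetical and only serve to show the mapping in both directions.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample labels, not taken from any question on this page.
labels = ["cat", "dog", "bird", "dog", "cat"]

encoder = LabelEncoder()
# fit_transform learns the distinct classes (stored in sorted order)
# and returns the integer code of each label.
codes = encoder.fit_transform(labels)

print(encoder.classes_.tolist())  # ['bird', 'cat', 'dog']
print(codes.tolist())             # [1, 2, 0, 2, 1]

# inverse_transform maps the integer codes back to the original labels.
print(encoder.inverse_transform(codes).tolist())
```

Note that the integer codes follow the sorted order of the classes, not the order of appearance or the frequency of each label.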
Questions tagged [label-encoding]
119 questions
2
votes
0 answers
le.transform() ValueError: y contains previously unseen labels: [1, 2, 3, 4]
I'm running some very basic code to create encoder classes, and then use the same classes to encode a new dataframe. In this code, I don't need to use np.save and np.load; however, in my actual implementation, I will need to re-load the encoder to…

Jo Bennet
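The question above is truncated, so the following is only a sketch of the general pattern it mentions: saving the fitted classes with np.save and restoring them into a fresh LabelEncoder with np.load. The labels and the temporary file path are hypothetical. The sketch also shows that transform raises a ValueError for any value that was not present when the encoder was fitted, which is the error named in the title.

```python
import os
import tempfile

import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical training labels.
train_labels = ["low", "medium", "high", "medium"]
encoder = LabelEncoder().fit(train_labels)

# Persist only the learned classes. These are plain strings here; an
# object-dtype array would need np.load(..., allow_pickle=True) instead.
path = os.path.join(tempfile.mkdtemp(), "classes.npy")
np.save(path, encoder.classes_)

# Later: rebuild an encoder from the saved classes.
restored = LabelEncoder()
restored.classes_ = np.load(path)

restored_codes = restored.transform(["high", "low"])
print(restored_codes.tolist())  # [0, 1]

# Values the encoder never saw during fit are rejected.
try:
    restored.transform(["unknown"])
except ValueError as err:
    unseen_error = str(err)
    print(unseen_error)
```

If the values passed to transform are already integers while the encoder was fitted on strings (or the other way round), every value counts as unseen, so it is worth checking the dtype of both sides.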
2
votes
0 answers
Spark: What is the best way to do label encoding on a feature of variable length?
For Spark, there is a StringIndexer in Spark ML that can do label encoding for a given column. However, it cannot directly handle the situation where the column is a variable-length feature (or multi-value feature). For example,…

CyberPlayerOne
1
vote
1 answer
I can't apply LabelEncoder to an array of bool
I am working on a machine learning project. I imported all the libraries. I took one column of the data (this column is an array of bool) and I want to apply LabelEncoder to it.
Here is my whole code.
data = pd.read_csv('odev_tenis.csv')
le =…

metkopetru
1
vote
2 answers
How do I filter columns with data_type=object?
encoder = LabelEncoder()
categorical_features = df.columns.tolist()
for col in categorical_features:
    df[col] = encoder.fit_transform(df[col])
df.head(20)
I want categorical_features to take only the columns with datatype=object.

aarthi sharma
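One possible way to restrict the loop above to object-dtype columns is pandas' select_dtypes. The sketch below uses a hypothetical DataFrame; the string columns are created with dtype=object explicitly, because newer pandas versions may infer a dedicated string dtype instead. It also keeps one encoder per column, an addition not present in the question, so that each column can be decoded later.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical data; dtype=object is set explicitly for the string columns.
df = pd.DataFrame({
    "city": pd.Series(["Oslo", "Berlin", "Oslo"], dtype=object),
    "plan": pd.Series(["free", "paid", "free"], dtype=object),
    "age": [31, 45, 27],
})

# Keep only the names of columns whose dtype is object.
categorical_features = df.select_dtypes(include="object").columns.tolist()
print(categorical_features)  # ['city', 'plan']

# One encoder per column, so each mapping can be inverted independently.
encoders = {}
for col in categorical_features:
    encoders[col] = LabelEncoder()
    df[col] = encoders[col].fit_transform(df[col])

print(df["city"].tolist())  # [1, 0, 1]
```

Reusing a single encoder across columns, as in the original snippet, still encodes every column, but after the loop the encoder only remembers the classes of the last column.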
1
vote
1 answer
Why is the index of Label Encoding not seriated?
This is my label value:
df['Label'].value_counts()
------------------------------------
Benign 4401366
DDoS attacks-LOIC-HTTP 576191
FTP-BruteForce 193360
SSH-Bruteforce 187589
DoS attacks-GoldenEye …

Dead
1
vote
0 answers
How to encode the new df values with existing LabelEncoder
I am quite new to ML; can anyone please help me?
I am facing an issue while encoding and decoding the DF below using preprocessing.LabelEncoder()
df.head()
Col1 | Col2 | Col3 | Col4 | Col5 | Col6
0 | Minor | Yes | …

Ankit Bijlwan
1
vote
1 answer
LabelEncoding in Pandas on a column with list of strings across rows
I would like to LabelEncode a column in pandas where each row contains a list of strings. Since the same string/text carries the same meaning across rows, the encoding should respect that and ideally map it to a unique number. Imagine:
import…

TwinPenguins
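The question's own example is cut off, so the sketch below uses a hypothetical "tags" column. It shows one way to give each distinct string the same number in every row: fit a single LabelEncoder on the flattened values, then transform each list.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical column where every row holds a list of strings.
df = pd.DataFrame({"tags": [["red", "blue"], ["blue"], ["green", "red"]]})

# explode() flattens the lists into one long Series, so the encoder
# sees every string that occurs anywhere in the column.
encoder = LabelEncoder().fit(df["tags"].explode())

# Encode each row's list with the shared encoder.
df["tags_encoded"] = df["tags"].apply(
    lambda items: encoder.transform(items).tolist()
)

print(encoder.classes_.tolist())    # ['blue', 'green', 'red']
print(df["tags_encoded"].tolist())  # [[2, 0], [0], [1, 2]]
```

This assumes every row contains a non-empty list of strings; rows with missing values would need to be handled before fitting.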
1
vote
1 answer
Alternatives to LabelEncoder() for target variable while implementing in a pipeline
I am developing a classification base model. I have used ColumnTransformer and Pipeline for feature engineering and selection, model selection, and everything else. I wanted to encode my categorical target (dependent) variable to…

Shreejan Shrestha
1
vote
1 answer
Label encoding by value counts
I am trying to do label encoding for my cities. However, I want the labels to be assigned according to how often each city occurs. Let's say:
Oslo has 500 rows
Berlin has 400 rows
Napoli has 300 rows in the dataset
So label encoding will label those cities…

efc07
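The question is truncated before it states which label each city should get, so the sketch below assumes the most frequent city receives 0, the next one 1, and so on. It uses pandas' value_counts with scaled-down, hypothetical row counts in the same ratio order as the example (Oslo, then Berlin, then Napoli).

```python
import pandas as pd

# Scaled-down hypothetical data: Oslo is most frequent, Napoli least.
cities = pd.Series(["Oslo"] * 5 + ["Berlin"] * 4 + ["Napoli"] * 3)

# value_counts() sorts by frequency in descending order by default.
counts = cities.value_counts()

# Assumption: rank 0 goes to the most frequent city.
rank_by_city = {city: rank for rank, city in enumerate(counts.index)}
encoded = cities.map(rank_by_city)

print(rank_by_city)  # {'Oslo': 0, 'Berlin': 1, 'Napoli': 2}
```

Cities with identical counts would need an explicit tie-breaking rule if a stable order between them matters.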
1
vote
1 answer
Label Encoder and Inverse_Transform on SOME Columns
Suppose I have a dataframe like the following
df = pd.DataFrame({'animal': ['Dog', 'Bird', 'Dog', 'Cat'],
'color': ['Black', 'Blue', 'Brown', 'Black'],
'age': [1, 10, 3, 6],
…

user15160039
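A minimal sketch of one approach to the question above: keep a separate LabelEncoder for each column that should be encoded, so that each one can be inverted later. Only the three columns visible in the truncated excerpt are used, and the choice of which columns to encode is an assumption.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Only the columns visible in the excerpt; the rest of the frame is elided.
df = pd.DataFrame({
    "animal": ["Dog", "Bird", "Dog", "Cat"],
    "color": ["Black", "Blue", "Brown", "Black"],
    "age": [1, 10, 3, 6],
})

columns_to_encode = ["animal", "color"]  # hypothetical selection

# Fit one encoder per selected column and keep them in a dict.
encoders = {col: LabelEncoder().fit(df[col]) for col in columns_to_encode}

encoded = df.copy()
for col, enc in encoders.items():
    encoded[col] = enc.transform(df[col])

# Reverse the mapping with the matching encoder for each column.
decoded = encoded.copy()
for col, enc in encoders.items():
    decoded[col] = enc.inverse_transform(encoded[col])

print(encoded["animal"].tolist())  # [2, 0, 2, 1]
print(decoded["animal"].tolist())  # ['Dog', 'Bird', 'Dog', 'Cat']
```

Columns that are not listed, such as age, pass through unchanged.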
1
vote
1 answer
Feature selection and categorical variables
I work on a dataset which contains mainly binary variables. However, two of them are categorical with multiple values (strings). I want to apply feature selection using lasso, but I get an error: Keyerror: could not convert string to float:
Should I use…

Gvasiles
1
vote
0 answers
Dask-ml LabelEncoder.fit_transform() threw AttributeError: 'bool' object has no attribute 'astype'
So I tried to apply LabelEncoder() function to columns that have object dtype on my Dask dataframe:
le = dm.LabelEncoder() #dm is dask-ml module
for column in df.columns:
    if df[column].dtype == type(object):
        df[column]…

Nendra Haryo
1
vote
2 answers
Mapping categorical data from user input to its actual encoded value for prediction
A portion of my dataset looks like this (there are many other processor types in my actual data)
df.head(4)
Processor Task Difficulty Time
i3 34 3 6
i7 34 3 4
i3 50 1 6
i5 25 2…

sebin
1
vote
2 answers
Label Encoding using weights for string nominal variables for random forest classification
I have the NYC 311 complaint dataset. I want to build a random forest classifier which will take categorical input features about a complaint and will determine the complaint type.
The following are the input features of a given complaint record:
X =…

Sujit Desai
1
vote
0 answers
return array(a, dtype, copy=False, order=order) ValueError: could not convert string to float: 'STRING' when building machine learning model
I'm getting the following error: return array(a, dtype, copy=False, order=order)
ValueError: could not convert string to float: 'BOX72' (BOX72 is a value under column5).
The error seems to come from the line with the code impute_knn.fit_transform(X)
Here…

J. Doe