
For context, I am using ad listing data for machines to predict the type of machine.

I am using RandomForestClassifier for class prediction. Before fitting, I used LabelEncoder to convert all categorical variables, including the label being predicted (for example, 'Excavator' becomes 5). After running the model successfully, I am left with an array of predicted values, which are the encoded (numerical) values. What I would like to do now is convert these predictions back into their original strings, e.g. map the number 5 back to its original value of 'Excavator', ideally mapping all of the predicted values in one DataFrame.

I have left out much of the code below so as not to drown people in the full script, keeping only what I deem most relevant to my question. If you need to see more in order to help, please let me know!

### ENCODE TO CATEGORICAL ###

# Encoding categorical variables
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()

# Choose columns to encode
cols = ['make', 'model_of_Ad', 'year_manufactured', 'business', "tag_name_deep"]

# Encode columns
df[cols] = df[cols].apply(LabelEncoder().fit_transform)

# Reset df index
df.reset_index(drop=True, inplace=True)

....

from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics

# define the model
rf = RandomForestClassifier()

# fit the model on the training set
rf.fit(X_train, y_train)

# Predict on the test set in order to assess accuracy
y_pred = rf.predict(X_test)

# Model Accuracy, how often is the classifier correct?
print("Accuracy:", metrics.accuracy_score(y_test, y_pred))

# See predicted values
print(y_pred)

Any help is appreciated!

jackyg
  • Have you tried `inverse_transform`? For example, `list(le.inverse_transform(y_pred))` – im_vutu May 25 '22 at 01:40
  • I get this error when I do that: `NotFittedError: This LabelEncoder instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.` – jackyg May 25 '22 at 04:40
  • Try rewriting `df[cols] = df[cols].apply(LabelEncoder().fit_transform)` as `df[cols] = df[cols].apply(le.fit_transform)` – im_vutu May 25 '22 at 05:42
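
The `NotFittedError` arises because `le` is created but never fitted: the `apply` call in the question fits a separate, anonymous `LabelEncoder()` instead. Passing `le.fit_transform` to `apply` refits the same encoder on each column in turn, so `le` ends up holding only the classes of the last column in `cols`. A possible alternative, sketched below, is to keep one fitted encoder per column in a dictionary and decode the predictions with the target column's encoder. This is a minimal sketch, not the asker's actual pipeline: the toy data, the `encoders` dictionary, the `feature_cols`/`target_col` names, and the choice of `tag_name_deep` as the target are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Toy stand-in for the ad listing data (all values are made up).
df = pd.DataFrame({
    "make": ["CAT", "Komatsu", "CAT", "JCB", "Volvo", "JCB", "CAT", "Volvo"],
    "business": ["dealer", "private", "dealer", "dealer",
                 "private", "private", "dealer", "dealer"],
    "tag_name_deep": ["Excavator", "Bulldozer", "Excavator", "Loader",
                      "Loader", "Bulldozer", "Excavator", "Loader"],
})

feature_cols = ["make", "business"]
target_col = "tag_name_deep"  # assumption: this is the label being predicted

# Keep one fitted encoder per column so each can be inverted later.
encoders = {}
for col in feature_cols + [target_col]:
    encoders[col] = LabelEncoder()
    df[col] = encoders[col].fit_transform(df[col])

X_train, X_test, y_train, y_test = train_test_split(
    df[feature_cols], df[target_col], test_size=0.25, random_state=0
)

rf = RandomForestClassifier(random_state=0)
rf.fit(X_train, y_train)
y_pred = rf.predict(X_test)

# Decode the numeric predictions back to the original strings,
# side by side with the encoded values, in one DataFrame.
results = pd.DataFrame(
    {
        "predicted_code": y_pred,
        "predicted_label": encoders[target_col].inverse_transform(y_pred),
        "actual_label": encoders[target_col].inverse_transform(y_test),
    },
    index=X_test.index,
)
print(results)
```

The same dictionary can decode feature columns too, e.g. `encoders["make"].inverse_transform(X_test["make"])`. Note that scikit-learn documents `LabelEncoder` as intended for target labels; for feature columns, `OrdinalEncoder` may be a better fit, though that goes beyond what the question asks.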

0 Answers