I'm a beginner trying to build a predictive model with Random Forest in Python, using train and test datasets. train["ALLOW/BLOCK"] takes one of four expected values (all strings), and test["ALLOW/BLOCK"] is what needs to be predicted. I encoded the training labels as integers with pd.factorize:
y,_ = pd.factorize(train["ALLOW/BLOCK"])
y
Out[293]: array([0, 1, 0, ..., 1, 0, 2], dtype=int64)
I used predict for the prediction:
clf.predict(test[features])
clf.predict(test[features])[0:10]
Out[294]: array([0, 0, 0, 0, 0, 2, 2, 0, 0, 0], dtype=int64)
How can I get the original values instead of the numeric ones? Is the following code actually comparing the actual and predicted values?
z,_= pd.factorize(test["AUDIT/BLOCK"])
z==clf.predict(test[features])
Out[296]: array([ True, False, False, ..., False, False, False], dtype=bool)
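For reference, here is a minimal sketch of how the integer codes relate to the original strings. It assumes the second value returned by pd.factorize (discarded as `_` above) is kept. The label values, `predicted_codes`, and `test_labels` are made-up stand-ins, since the real data and the fitted `clf` are not shown here.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for train["ALLOW/BLOCK"]; the real values are not shown.
labels = pd.Series(["ALLOW", "BLOCK", "ALLOW", "AUDIT", "BLOCK"])

# Keep the second return value: the unique labels, ordered by their integer code.
codes, uniques = pd.factorize(labels)

# Hypothetical stand-in for clf.predict(test[features]): an array of integer codes.
predicted_codes = np.array([0, 0, 2, 1])

# Indexing the uniques with the codes recovers the original strings.
predicted_labels = uniques[predicted_codes]

# Hypothetical stand-in for the true test labels.
test_labels = pd.Series(["ALLOW", "BLOCK", "AUDIT", "BLOCK"])

# Encode the test labels with the SAME mapping as the training labels.
# Labels never seen in training come back as -1.
test_codes = uniques.get_indexer(test_labels)

# Element-wise comparison of actual vs. predicted codes.
matches = test_codes == predicted_codes
```

Calling pd.factorize separately on the test column numbers the labels in the order they first appear there, so the same string may get a different code than it did in train. Reusing `uniques` from the training call avoids that.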