
Rattle on macOS Catalina 10.15.6 gives the error message "The supplied actual and predicted must have the same levels." when I evaluate model performance for the Boost method under the Evaluate tab.

Error message from the R console: Error in rattle::errorMatrix(crs$dataset[crs$test, c(crs$input, crs$target)]$TFC_churn, : The supplied actual and predicted must have the same levels.

How to rectify?

[Image: matrix]

Log code: [image not included]

hagewhy

1 Answer

This might have to do with your input data. It is not entirely clear what you are doing (maybe add some additional info), but:

The supplied actual and predicted must have the same levels.

This is a typical error message when your test dataset does not match the train dataset.

Test and train probably have the same variables, but for factor variables the factor levels usually have to match as well.

For example:

train:

gender: factor - male/female

test:

gender: factor - male/female/unknown

This could cause problems, since the test set suddenly contains a new factor level that the model has not seen before and does not know how to handle.
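A mismatch like this can be spotted by comparing the factor levels directly. The snippet below is only a minimal sketch in base R: the `train` and `test` data frames are hypothetical toy data mirroring the gender example, not the data from the question.

```r
# Hypothetical toy data mirroring the gender example (not the asker's data).
train <- data.frame(gender = factor(c("male", "female", "female")))
test  <- data.frame(gender = factor(c("male", "unknown", "female")))

# Levels that occur in test but were never seen in train.
new_levels <- setdiff(levels(test$gender), levels(train$gender))
print(new_levels)

# TRUE only if both factors carry exactly the same levels in the same order.
same_levels <- identical(levels(train$gender), levels(test$gender))
print(same_levels)
```

The same comparison could presumably be applied to the actual and predicted vectors that are passed to the error matrix, to see which of the two carries the extra or missing level.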

Often it is enough to just add this level to the variable in the train set (even if there is no instance of it).
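Adding the missing level could look roughly like this. Again a minimal base R sketch on hypothetical toy data; `train`, `test` and `all_levels` are illustrative names, and a model fitted before the change would presumably have to be refitted afterwards.

```r
# Hypothetical toy data mirroring the gender example (not the asker's data).
train <- data.frame(gender = factor(c("male", "female", "female")))
test  <- data.frame(gender = factor(c("male", "unknown", "female")))

# Build one shared set of levels and apply it to both data sets.
all_levels   <- union(levels(train$gender), levels(test$gender))
train$gender <- factor(train$gender, levels = all_levels)
test$gender  <- factor(test$gender,  levels = all_levels)

# train now knows the level "unknown", although no row uses it.
print(levels(train$gender))
print(table(train$gender))
```

Re-creating the factor with an explicit `levels` argument keeps the existing values unchanged and only extends the set of allowed levels.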

Steffen Moritz
  • It is very strange, as I only have one target variable throughout, churn or not (which is categorical). :S The error matrix produces an error only under the boost model, but works perfectly fine under the other models, e.g. tree/random forest. I wonder if I need to install any additional packages to go with my rattle R package in order to rectify this? It is pretty frustrating that the error matrix does not work consistently across all the models built, as I can't assess the performance across all of them. – hagewhy Sep 01 '20 at 04:56
  • Strange ... it's been ages since I last used Rattle. Wasn't there some kind of log/export that you can use to export the R code you actually triggered via the GUI? Maybe it would be a good idea to share it in the question. Otherwise it is kind of hard to guess what might be wrong. – Steffen Moritz Sep 01 '20 at 05:07
  • # Obtain the response from the Extreme Boost model.
    crs$pr <- predict(crs$ada, newdata=crs$dataset[crs$test, c(crs$input, crs$target)])
    # Generate the confusion matrix showing counts.
    rattle::errorMatrix(crs$dataset[crs$test, c(crs$input, crs$target)]$TFC_churn, crs$pr, count=TRUE)
    # Generate the confusion matrix showing proportions.
    (per <- rattle::errorMatrix(crs$dataset[crs$test, c(crs$input, crs$target)]$TFC_churn, crs$pr))
    # Calculate the overall error percentage.
    cat(100-sum(diag(per), na.rm=TRUE))
    # Calculate the averaged class error percentage.
    cat(mean(per[,"Error"], na.rm=TRUE))
    – hagewhy Sep 01 '20 at 05:34
  • Put it in the initial question so everybody can see it easily :) – Steffen Moritz Sep 01 '20 at 05:38
  • Just a small hint: people here prefer to have the code as text, not as a picture. (There are options in the editor to format text as code, and then it also looks quite good.) Pity that nobody else gave an opinion here; I guess your question is rather specific. If nobody answers, you can also try to set a bounty on your question, or add a minimal reproducible example that produces the error (which people could run on their own computers). – Steffen Moritz Sep 01 '20 at 18:39