I am trying to compute the confusion matrix for a multi-class classification problem on a very large data frame, which is split and scaled into Train_Scale and Test_Scale sets (the scaling parameters from the training set are applied to the test set).
The model was fitted with ranger:
set.seed(123)
library(ranger)
library(caret)
Class.ranger <- ranger(Class~., data = Train_Scale, num.trees = 5000, importance = "impurity", save.memory = TRUE, probability = TRUE)
The variable Class has 5 levels:
str(Test_Scale$Class)
Factor w/ 5 levels "A","B",..: 5 1 1 1 1 5 5 5 1 1 ...
Validation is done on the test set as follows:
set.seed(123)
probabilitiesClass <- predict(Class.ranger, data = Test_Scale, num.trees = 5000, type='response', verbose = TRUE)
probabilitiesClass is a list of 5 elements, and its predictions element is of type double.
I get the following error when trying to interpret the results with a confusion matrix:
> caret::confusionMatrix(Test_Scale$Class, probabilitiesClass$predictions)
Error: `data` and `reference` should be factors with the same levels.
Does predictions (shown above as double) have to be a factor, given that Class is a factor with 5 levels?
Alternatively, using table() (note that there are no NA values) gives the following error:
table(Test_Scale$Class, probabilitiesClass$predictions)
Error in table(Test_Scale$Class, probabilitiesClass$predictions):
all arguments must have the same length
What is going wrong, and how can the confusion matrix be obtained for this multiclass classification using ranger (preferred, since caret interprets only up to 53 levels?) and caret?
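A likely cause, assuming probability = TRUE makes predictions an n x 5 matrix of class probabilities (one column per level) rather than a vector of predicted classes: that would explain both the factor error and the length mismatch in table(). The sketch below uses a small made-up matrix (prob, truth, and pred_class are hypothetical names, not objects from the model above) to show how such a matrix could be collapsed to one predicted class per row before tabulating.

```r
# Toy stand-in for probabilitiesClass$predictions: one row per observation,
# one column per class. The values are invented for illustration only.
class_levels <- c("A", "B", "C", "D", "E")
prob <- matrix(
  c(0.70, 0.10, 0.10, 0.05, 0.05,
    0.10, 0.10, 0.10, 0.10, 0.60,
    0.20, 0.50, 0.10, 0.10, 0.10),
  ncol = 5, byrow = TRUE,
  dimnames = list(NULL, class_levels)
)

# Toy stand-in for Test_Scale$Class.
truth <- factor(c("A", "E", "C"), levels = class_levels)

# Pick the most probable class in each row, then rebuild a factor that
# carries exactly the same levels as the reference.
pred_class <- factor(
  colnames(prob)[max.col(prob, ties.method = "first")],
  levels = levels(truth)
)

# Both arguments now have one entry per observation, so table() works.
cm <- table(Predicted = pred_class, Actual = truth)
print(cm)
```

With the real objects, the same idea would be to build pred_class from probabilitiesClass$predictions and levels(Test_Scale$Class), then call caret::confusionMatrix(data = pred_class, reference = Test_Scale$Class), noting that data is the prediction and reference is the truth. If class probabilities are not needed, fitting the forest without probability = TRUE should return predicted classes as a factor directly.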