
I am fitting a logistic regression in R to predict, from my natural disaster dataset, the probability that an event causes deaths. I am building a confusion matrix to evaluate the model, and I get this warning every time I use this specific column.

Warning message: In confusionMatrix.default(factor(predicted.classes), factor(test.data$deaths), : Levels are not in the same order for reference and data. Refactoring data to match.

When I use another column, the same code runs perfectly fine and I get results. This is my code:

library(magrittr)  # provides the %>% pipe used below

model <- glm(deaths ~ ., data = train.data, family = binomial)
summary(model)

probabilities <- model %>% predict(test.data, type = "response")
predicted.classes <- ifelse(probabilities > 0.5, 1, 0)

result <- caret::confusionMatrix(as.factor(predicted.classes), as.factor(test.data$deaths), positive = "1")

However, with the aforementioned variable, these are my results:

               Accuracy : 0.9826          
                 95% CI : (0.9796, 0.9853)
    No Information Rate : 0.9826          
    P-Value [Acc > NIR] : 0.522           
                                          
                  Kappa : 0               
                                          
 Mcnemar's Test P-Value : <2e-16          
                                          
            Sensitivity : 0.00000         
            Specificity : 1.00000         
         Pos Pred Value :     NaN         
         Neg Pred Value : 0.98262         
             Prevalence : 0.01738         
         Detection Rate : 0.00000         
   Detection Prevalence : 0.00000         
      Balanced Accuracy : 0.50000         
                                          
       'Positive' Class : 1               

Is the logistic regression not predicting correctly, or is the model simply that poor?
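The reported metrics (Sensitivity 0, Specificity 1, Detection Prevalence 0, Accuracy equal to the No Information Rate) are what one would see if every test row were classified as 0. A minimal base-R sketch of that situation follows. The vectors `actual` and `probabilities` are synthetic stand-ins rather than the asker's data, and the 20-in-1000 class balance is an assumption chosen to resemble the reported prevalence.

```r
# Synthetic stand-in for test.data$deaths: 20 positives out of 1000 (assumption).
actual <- c(rep(0, 980), rep(1, 20))

# Hypothetical predicted probabilities, all below the 0.5 cutoff.
probabilities <- seq(0.01, 0.30, length.out = 1000)
predicted.classes <- ifelse(probabilities > 0.5, 1, 0)

# If the largest probability is below 0.5, no row is ever classified as 1.
print(max(probabilities))
print(table(predicted.classes))

# Metrics computed by hand for the all-zero prediction.
accuracy    <- mean(predicted.classes == actual)
sensitivity <- sum(predicted.classes == 1 & actual == 1) / sum(actual == 1)
specificity <- sum(predicted.classes == 0 & actual == 0) / sum(actual == 0)
print(c(accuracy = accuracy, sensitivity = sensitivity, specificity = specificity))
```

Running `summary(probabilities)` on the real predictions would show whether the same thing is happening with the actual model; that check is a suggestion, not something confirmed in the thread.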

Rui Barradas
  • Searching the error message among questions tagged with **r** ( https://stackoverflow.com/search?q=%5Br%5D+Levels+are+not+in+the+same+order+for+reference+and+data.+Refactoring+data+to+match ) turned up e.g. this answer which might solve the issue: https://stackoverflow.com/questions/32077843/confusionmatrix-for-a-classifier-in-r – May 12 '22 at 20:23
  • You may need to set the "levels" option explicitly for a consistent ordering. Note that `as.factor` does not take a `levels` argument, so use `factor` instead. See if this makes the warnings stop: `confusionMatrix(factor(predicted.classes, levels = c(0, 1)), factor(test.data$deaths, levels = c(0, 1)), positive = "1")` – Dave2e May 12 '22 at 22:31
  • I added the "levels" option but I still receive the same warning and results. – elena pap May 13 '22 at 10:15
  • Without having a sample of your data to reproduce the issue, it was just a guess. – Dave2e May 13 '22 at 16:31
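The point about levels in the comments can be illustrated with a small base-R sketch. It assumes, hypothetically, that every prediction is 0, in which case `as.factor` creates only one level while the reference keeps two. The vectors here are made up for illustration, and whether explicit levels silence the warning in the asker's setup is not confirmed by the thread.

```r
# Hypothetical inputs: every prediction is 0, the reference has both classes.
predicted.classes <- rep(0, 5)
actual <- c(0, 0, 1, 0, 1)

# as.factor() only creates levels for values that actually occur.
print(levels(as.factor(predicted.classes)))
print(levels(as.factor(actual)))

# factor() with explicit levels gives both vectors the same level set.
pred <- factor(predicted.classes, levels = c(0, 1))
ref  <- factor(actual, levels = c(0, 1))
print(levels(pred))

# The cross-tabulation stays 2 x 2 even though class 1 is never predicted.
tab <- table(Predicted = pred, Reference = ref)
print(tab)
```

Even with matching levels, the row for predicted class 1 remains all zeros here, so the metrics themselves would not change; only the factor structure does.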
