
I have made an SVM predictor, which can classify samples into one of three groups: "good", "bad" or "ok". However, the test dataset only contains samples classed as "good" or "bad". I get an error when I try to use multi_roc(), and I'm not sure of the best way to solve it. The example I've made is below:

library(tidymodels)
library(mlbench)
library(multiROC)
data(Ionosphere)

# preprocess dataset
Ionosphere <- Ionosphere %>% select(-V1, -V2)

# split into training and test data
ion_split <- initial_split(Ionosphere, prop = 3/5)

ion_train <- training(ion_split)
ion_test <- testing(ion_split) 

# making an artificial third class in the training set for this example
ion_train[,33] <- as.character(ion_train[,33])
ion_train[1:7,33] <- "ok"
ion_train[,33] <- as.factor(ion_train[,33])

# make a recipe
iono_rec <-
  recipe(Class ~ ., data = ion_train)  %>%
  step_normalize(all_predictors()) 

# build the model and workflow
svm_mod <-
  svm_rbf(cost = tune(), rbf_sigma = tune()) %>%
  set_mode("classification") %>%
  set_engine("kernlab")

svm_workflow <- 
      workflow() %>%
      add_recipe(iono_rec) %>%
      add_model(svm_mod)

# run model tuning
set.seed(35)
recipe_res <-
  svm_workflow %>% 
  tune_grid(
    resamples = bootstraps(ion_train, times = 2),
    metrics = metric_set(roc_auc),
    control = control_grid(verbose = TRUE, save_pred = TRUE)
  )

# choose best model, finalise workflow
best_mod <- recipe_res %>% select_best(metric = "roc_auc")
final_wf <- finalize_workflow(svm_workflow, best_mod)
final_mod <- final_wf %>% fit(ion_train)

predict_res <- predict(
        final_mod,
        ion_test,
        type = "prob")


# combine predictions with the true classes and rename to the
# <group>_true / <group>_pred_<method> column pattern that multiROC expects
results <- predict_res %>% 
    cbind(ion_test$Class) %>%
    dplyr::rename(
        bad_pred_svm = .pred_bad,
        good_pred_svm = .pred_good,
        ok_pred_svm = .pred_ok,
        class = `ion_test$Class`
    ) %>%
    mutate(
        bad_true = ifelse(class == "bad", 1, 0),
        good_true = ifelse(class == "good", 1, 0),
        ok_true = ifelse(class == "ok", 1, 0)
    ) %>%
    dplyr::select(-class)

This produces a results dataframe that looks like this:

  bad_pred_svm good_pred_svm ok_pred_svm bad_true good_true ok_true
1   0.01166109    0.92349066  0.06484826        0         1       0
2   0.82937620    0.07576908  0.09485472        1         0       0
3   0.05858563    0.88043189  0.06098248        0         1       0
4   0.91602211    0.04624037  0.03773753        1         0       0
5   0.91841475    0.04407115  0.03751410        1         0       0
6   0.01014520    0.94295540  0.04689940        0         1       0

When I try to put this into multi_roc(), I get an error:

multi_roc_svm <- multi_roc(results, force_diag = TRUE)

Error in approx(res_sp[[i]][[j]], res_se[[i]][[j]], all_sp, yleft = 1,  : 
  need at least two non-NA values to interpolate
In addition: Warning messages:
1: In regularize.values(x, y, ties, missing(ties), na.rm = na.rm) :
  collapsing to unique 'x' values
2: In regularize.values(x, y, ties, missing(ties), na.rm = na.rm) :
  collapsing to unique 'x' values

I'm 99% sure this error is because I do not have any samples of the "ok" class in my test data frame, but I don't know how to get around this. Could I plot the multiclass ROC curve by hand?
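
A quick check of the one-hot truth columns backs this up: ok_true sums to zero, so multi_roc() has nothing to interpolate for that class.

colSums(results[, c("bad_true", "good_true", "ok_true")])
# ok_true totals 0; the bad_true/good_true counts depend on the random split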

icedcoffee
  • If you can create a [reprex](https://www.tidyverse.org/help/), that will help folks understand the scope and causes of your question and find an answer. That being said, have you tried using [`yardstick::roc_curve()`](https://yardstick.tidymodels.org/reference/roc_curve.html) here? It works for multiclass results. – Julia Silge Apr 21 '21 at 22:25

1 Answer


I don't know what package multi_roc() is in, but the tidymodels solution is pretty easy.

If you just want to get the ROC value from the multiclass ROC curve, you can use the yardstick function:

> predict_res %>% 
+     bind_cols(ion_test) %>% 
+     # or roc_curve(Class, .pred_bad)
+     roc_auc(Class, .pred_bad)
# A tibble: 1 x 3
  .metric .estimator .estimate
  <chr>   <chr>          <dbl>
1 roc_auc binary         0.976
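
If you want the curves rather than just the AUC value, roc_curve() plus autoplot() works the same way. As a sketch (only the classes that actually occur in the test set can be plotted, since a one-vs-rest curve can't be computed for a class with no positive cases):

predict_res %>% 
    bind_cols(ion_test) %>% 
    roc_curve(Class, .pred_bad) %>% 
    autoplot()

And a sketch for building the one-vs-rest curves "by hand" from the indicator columns constructed in the question, with "ok" skipped because it never appears in the test data:

bind_rows(
    bad  = roc_curve(mutate(results, truth = factor(bad_true,  levels = c(1, 0))), truth, bad_pred_svm),
    good = roc_curve(mutate(results, truth = factor(good_true, levels = c(1, 0))), truth, good_pred_svm),
    .id = "class"
) %>%
    ggplot(aes(1 - specificity, sensitivity, colour = class)) +
    geom_path() +
    geom_abline(lty = 2)
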
topepo