
I am in the process of creating a radial SVM classification model, and I would like to perform 5-fold CV on it and tune it. I have seen how others do it here and followed those instructions. However, my code does not apply my tuning grid. Also, I do not understand why I cannot get Accuracy or an F1 value when I train the model explicitly.

With 5-fold CV

library(caret)
set.seed(500)
ctrl <- trainControl(method = "repeatedcv",
                      number = 5,
                      repeats = 3, 
                      classProb=T,
                      summaryFunction = twoClassSummary
                     )
sigma<-c(2^-15,2^-13,2^-11,2^-9,2^-7,2^-5,2^-3,2^-1,2^1,2^2,2^3)
C<-c(2^-5,2^-3,2^-1,2^1,2^2,2^3,2^5,2^7,2^9,2^11,2^13)
tuninggrid<-data.frame(expand.grid(sigma,C))

mod <- train(x = iris[-5], y=iris$Species,
             method = "svmRadial", 
             trControl = ctrl,
             metric=c('ROC'),
             tunegrid=tuninggrid)

The results simply say that sigma was held constant. Why does train() not use my tuning grid?

Secondly, when I change the metric from 'ROC' to 'Accuracy', it says Accuracy is not available. I understand this is because of my summaryFunction in trainControl. If I remove it, then I can get Accuracy, but not ROC. Ultimately, I would like both, plus an F1 value, but I cannot find documentation on this. How would I write something that gives me all of them at the same time?

Lastly, regarding the output from train(): to get the weights, it is just mod$finalModel@coef, correct?

Jack Armstrong
  • At first blush, have you looked at str(mod) and summary(mod)? – meh May 10 '19 at 15:31
  • Also, my recollection of how this works is that you have requested that the CV be done 5 times, not that you have created 5-fold CV. – meh May 10 '19 at 15:32
  • Ohh. So how would you fix that then? – Jack Armstrong May 10 '19 at 15:34
  • If you don't get an answer I'll try and give more details later. However, if you google the relevant terms you will find a lot of caret documentation exists online. Honestly that is, for your own edification, the best procedure. – meh May 10 '19 at 15:48
  • As much as I have reviewed it, I am still struggling to understand it. – Jack Armstrong May 10 '19 at 15:49
  • I suggest you read [this](http://topepo.github.io/caret/model-training-and-tuning.html#model-training-and-parameter-tuning) and all your questions should be answered. If still in doubt, post an update to the question. – missuse May 12 '19 at 12:26
  • I updated the question after doing the reading to make it clearer and more specific. Not all of my questions are answered, but it is heading in the right direction. – Jack Armstrong May 19 '19 at 09:39

1 Answer


There are a few small errors in your code:

  1. If you want to use the area under the ROC curve as the metric, you need to specify twoClassSummary as you did, but your response variable must also be binary (a two-level factor). For example:
    train(..., y = factor(ifelse(iris$Species=="setosa", "setosa", "other")), ...)
    
  2. If you want to use accuracy as metric, use defaultSummary instead of twoClassSummary

  3. If you View(tuninggrid) you will see that its column names are Var1 and Var2, whereas caret requires them to match svmRadial's tuning parameters, sigma and C. You can fix its definition:

    tuninggrid <- expand.grid(sigma = sigma, C = C)
    
  4. There is a typo in the call to train(...): the correct argument name is tuneGrid, not tunegrid (R is case sensitive).

Fixing these will solve your problem: View(mod$results)
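
Putting the four fixes together, here is a sketch of the corrected call (it reuses the sigma and C vectors defined in the question; binarising iris$Species to setosa-vs-other is just one illustrative choice):

library(caret)
set.seed(500)

ctrl <- trainControl(method = "repeatedcv",
                     number = 5,          # 5 folds
                     repeats = 3,         # repeated 3 times
                     classProbs = TRUE,   # required for class-probability metrics like ROC
                     summaryFunction = twoClassSummary)

# named columns so caret can match them to svmRadial's tuning parameters
tuninggrid <- expand.grid(sigma = sigma, C = C)

# twoClassSummary needs a two-level factor response
y <- factor(ifelse(iris$Species == "setosa", "setosa", "other"))

mod <- train(x = iris[-5], y = y,
             method = "svmRadial",
             trControl = ctrl,
             metric = "ROC",
             tuneGrid = tuninggrid)   # tuneGrid, not tunegrid

View(mod$results)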

EDIT: if you want to optimize accuracy (computed by defaultSummary) but also display the AUROC (from twoClassSummary) and/or an F measure (from prSummary), you can define your own summary function that combines them all and use it in trainControl:

combinedSummary <- function(data, lev = NULL, model = NULL) {
  c(
    defaultSummary(data, lev, model),   # Accuracy, Kappa
    twoClassSummary(data, lev, model),  # ROC, Sens, Spec
    prSummary(data, lev, model)         # AUC, Precision, Recall, F (needs the MLmetrics package)
  )
}
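
For example, a usage sketch (prSummary relies on the MLmetrics package, and classProbs must stay TRUE; the metric passed to train still picks the single value that is optimised, while the other columns are simply reported alongside it):

library(MLmetrics)   # required by prSummary

ctrl <- trainControl(method = "repeatedcv",
                     number = 5,
                     repeats = 3,
                     classProbs = TRUE,
                     summaryFunction = combinedSummary)

mod <- train(x = iris[-5],
             y = factor(ifelse(iris$Species == "setosa", "setosa", "other")),
             method = "svmRadial",
             trControl = ctrl,
             metric = "Accuracy",    # optimised; the other metrics are still reported
             tuneGrid = tuninggrid)

mod$results   # includes Accuracy, Kappa, ROC, Sens, Spec, AUC, Precision, Recall, F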
Pierre Gramme
  • How would you get Accuracy and ROC and F1 all at once though? Or can you only do one or the other? – Jack Armstrong May 21 '19 at 15:07
  • The objective function that you optimize should be one-dimensional. Otherwise, how would you decide which option is best between (Acc=0.81, AUC=0.95, F1=0.87) and (Acc=0.95, AUC=0.87, F1=0.81)? So the best approach is to choose one or the other, and if really necessary you can still build your own performance metric. – Pierre Gramme May 22 '19 at 17:04
  • Okay. That makes sense. So if I wanted to maximize accuracy I would not define the twoClassSummary part. But then I am assuming there must be ways to get the AUC and F1? – Jack Armstrong May 23 '19 at 15:31
  • Now I understand your comment better... I've edited the answer – Pierre Gramme May 23 '19 at 16:02
  • To follow up: the metric I select in train is 'AUC', not 'ROC', and within `View(mod$results)` I assume I am using the AUC value, not the ROC value that it produces? – Jack Armstrong Jun 10 '19 at 18:27
  • Indeed, `metric='ROC'` computes and optimises the AUC (which is the area under the ROC curve) – Pierre Gramme Jun 11 '19 at 06:56
  • I tested both, as in with `metric='ROC'` and then `'AUC'` and got the same results after I posted this, just wanted to confirm. – Jack Armstrong Jun 11 '19 at 15:52
  • Nice, thanks: I didn't know you could use value 'AUC' as alias for 'ROC' – Pierre Gramme Jun 12 '19 at 07:37