
Description:

  1. For a data set, I would like to apply an SVM using a radial basis function (RBF) kernel with the Weston-Watkins native multi-class formulation (a direct kernlab sketch of this model follows the list).
  2. The RBF kernel parameter sigma must be tuned, and I want to use k-fold cross-validation for this. I keep C fixed.
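
For reference, this corresponds to a direct call to ksvm() from the kernlab package along these lines (a minimal sketch; sigma = 0.1 is only a placeholder value, since sigma is the parameter I actually want to tune):

library(kernlab)

# Weston-Watkins native multi-class SVM with an RBF kernel and fixed C = 3.
fit = ksvm(Species ~ ., data = iris,
           type = "kbb-svc",          # Weston-Watkins native multi-class
           kernel = "rbfdot",         # RBF (Gaussian) kernel
           kpar = list(sigma = 0.1),  # example sigma, not tuned here
           C = 3)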

Solution:

It seems that I can use the nice package mlr to do this! So, to tune the RBF parameter sigma using CV for multi-class SVM classification (following this tutorial):

library(mlr)

# While C is fixed at 3, define a range to search sigma over: [10^-6, 10^6]
num_ps = makeParamSet(
  makeDiscreteParam("C", values = 3),
  makeNumericParam("sigma", lower = -6, upper = 6, trafo = function(x) 10^x)
)
# Define the grid search method
ctrl = makeTuneControlGrid()
# Use 3-fold cross-validation for the tuning
rdesc = makeResampleDesc("CV", iters = 3L)

res = tuneParams("classif.ksvm", task = iris.task, resampling = rdesc,
  par.set = num_ps, control = ctrl)
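
As a sanity check, the grid that will actually be evaluated and the tuning result can be inspected roughly like this (a sketch; generateGridDesign() comes from ParamHelpers, which mlr loads, and resolution = 5L is just an example value):

# Show the candidate values on the transformed (10^x) scale
generateGridDesign(num_ps, resolution = 5L, trafo = TRUE)

# After tuning: best hyperparameters and their cross-validated error
res$x  # e.g. list(C = 3, sigma = ...)
res$y  # mean misclassification error over the 3 folds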

Question:

For this part

res = tuneParams("classif.ksvm", task = iris.task, resampling = rdesc,
      par.set = num_ps, control = ctrl)

According to the documentation, by using the integrated learner classif.ksvm, I'm asking mlr to apply the multi-class classification defined by ksvm in the kernlab package.

How can I know which method and kernel type are used? I mean, how can I force the learner classif.ksvm to use the classification type (kbb-svc) and the kernel (rbfdot) that are already defined in ksvm?

If this is not possible, then how can I define a new learner with all of my requirements?
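
(For the first part, a sketch of how the defaults could be inspected, assuming mlr's getParamSet() and getHyperPars() helpers:)

# List the hyperparameters classif.ksvm exposes, with their default values
getParamSet("classif.ksvm")

# List the values currently set on a learner object
getHyperPars(makeLearner("classif.ksvm"))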


Answer:


You have to set the fixed parameters within the learner. Therefore you first have to create it:

library(mlr)
lrn = makeLearner("classif.ksvm", par.vals = list(C = 3, type = "kbb-svc", kernel = "rbfdot"))
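
A quick check (using mlr's getHyperPars()) confirms that the fixed values are attached to the learner:

getHyperPars(lrn)   # should include C = 3, type = "kbb-svc", kernel = "rbfdot"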

Then you define only the parameters that you want to tune within the ParamSet:

num_ps = makeParamSet(
  makeNumericParam("sigma", lower = -6, upper = 6, trafo = function(x) 10^x)
)

Then you can do the tuning as in your example:

ctrl = makeTuneControlGrid()
rdesc = makeResampleDesc("CV", iters = 3L)
res = tuneParams(lrn, task = iris.task, resampling = rdesc, par.set = num_ps, control = ctrl)
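
Once the tuning has finished, res$x holds the best parameter values (on the transformed 10^x scale) and res$y the cross-validated error. A sketch of how you could then fit a final model on the whole task with the tuned sigma:

lrn.tuned = setHyperPars(lrn, par.vals = res$x)   # attach the tuned sigma
mod = train(lrn.tuned, iris.task)                 # final model on the full data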
  • Thanks a lot for your answer. I just have another question, please! Normally, when tuning the parameter sigma with k-fold CV using other packages like caret, we use only the training set. Here, using the mlr package, do I need to provide only the training set, or can I, as in my example, provide the whole "iris" set and let tuneParams() handle the split? – adam Mar 04 '21 at 11:00
  • I mean, will tuneParams() tune sigma over the whole "iris" set, or will it split the set first and then tune sigma on the training part only? – adam Mar 04 '21 at 11:02
  • It will do the split according to `rdesc` on the whole data (so a 3-fold CV here). This will give you the "optimal" parameters. If you are interested in an unbiased estimate of the tuning performance, you have to do nested resampling using `makeTuneWrapper`. – jakob-r Mar 04 '21 at 11:13
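
A minimal sketch of that nested resampling setup, following the makeTuneWrapper pointer in the last comment (the 3-fold inner and outer splits are just example values):

# Inner CV tunes sigma; outer CV estimates the performance of the whole
# tuning procedure without optimistic bias.
inner = makeResampleDesc("CV", iters = 3L)
outer = makeResampleDesc("CV", iters = 3L)

lrn.wrapped = makeTuneWrapper(lrn, resampling = inner,
                              par.set = num_ps, control = ctrl)
r = resample(lrn.wrapped, iris.task, resampling = outer)
r$aggr   # aggregated outer-loop misclassification error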