
I'm trying to apply feature selection (e.g. recursive feature selection) to an SVM, using an R package. I've installed Weka, which supports feature selection with LibSVM, but I haven't found any example of the syntax for SVM or anything similar. A short example would be a great help.

Ofer Rahat

1 Answer


The `rfe` function in the caret package performs recursive feature selection (recursive feature elimination) for various algorithms. Here's an example from the caret documentation:

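library(caret)
# loads bbbDescr (predictors) and logBBB (outcome)
data(BloodBrain, package = "caret")
# drop near-zero-variance predictors, then center and scale the rest
x <- scale(bbbDescr[, -nearZeroVar(bbbDescr)])
# drop predictors involved in pairwise correlations above 0.8
x <- x[, -findCorrelation(cor(x), 0.8)]
x <- as.data.frame(x)
svmProfile <- rfe(x, logBBB,
                  sizes = c(2, 5, 10, 20),
                  rfeControl = rfeControl(functions = caretFuncs,
                                          number = 200),
                  ## pass options to train()
                  method = "svmRadial")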

# Here's what your results look like (this can take some time)
> svmProfile

Recursive feature selection

Outer resampling method: Bootstrap (200 reps) 

Resampling performance over subset size:

 Variables   RMSE Rsquared  RMSESD RsquaredSD Selected
         2 0.6106   0.4013 0.05581    0.08162
         5 0.5689   0.4777 0.05305    0.07665
        10 0.5510   0.5086 0.05253    0.07222
        20 0.5203   0.5628 0.04892    0.06721
        71 0.5202   0.5630 0.04911    0.06703        *

  The top 5 variables (out of 71):
  fpsa3, tcsa, prx, tcpa, most_positive_charge
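
Since the run above uses 200 bootstrap resamples and can take a while, here is a rough sketch of a cheaper variant that also shows how to pull the results out of the fitted object. The 5-fold outer cross-validation, the 3-fold inner `trControl`, the seed, and the name `quickProfile` are my own assumptions for illustration and are not from the caret documentation example. The resulting numbers will differ from the output above.

```r
library(caret)   # method = "svmRadial" also needs the kernlab package installed

data(BloodBrain, package = "caret")   # provides bbbDescr and logBBB

# Same preprocessing as in the example above
x <- scale(bbbDescr[, -nearZeroVar(bbbDescr)])
x <- as.data.frame(x[, -findCorrelation(cor(x), 0.8)])

set.seed(1)   # arbitrary seed, only so that repeated runs match
quickProfile <- rfe(x, logBBB,
                    sizes = c(2, 5, 10, 20),
                    # assumption: 5-fold CV instead of 200 bootstrap reps
                    rfeControl = rfeControl(functions = caretFuncs,
                                            method = "cv",
                                            number = 5),
                    ## passed through to train()
                    method = "svmRadial",
                    # assumption: lighter inner resampling for tuning
                    trControl = trainControl(method = "cv", number = 3))

quickProfile$results      # one row per subset size, plus the full predictor set
quickProfile$optsize      # subset size picked by the default selection rule
predictors(quickProfile)  # names of the predictors in the selected subset
```

The `Variables` column of `quickProfile$results` should contain the requested sizes plus the full predictor count, which matches the extra row for 71 in the printed output above.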
David Marx
  • What does `sizes = c(2, 5, 10, 20)` mean here? Does it mean features 2, 10 and 20? – Mahsolid Apr 28 '16 at 22:30
  • @Mahsolid No, each value is a count of features to use. `rfe` will try to find the best model of each size given in that vector. Check the `rfe` docs for more details. – David Marx May 04 '16 at 19:07
  • @DavidMarx Thanks for your explanation. What is the meaning of `number = 200` in the `rfe()` function call? – DavideChicco.it Feb 22 '19 at 16:30
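
On the `number = 200` question: `number` is an argument of `rfeControl()` and sets how many resampling iterations (or folds, for cross-validation) the outer loop runs. The default resampling method is the bootstrap, which is why the printed output above reads "Bootstrap (200 reps)". A minimal sketch that only builds control objects and fits nothing; `bootCtrl` and `cvCtrl` are illustrative names, and the repeated-CV settings are my own example values:

```r
library(caret)

# As in the answer: method is left at its default ("boot"),
# so number = 200 means 200 bootstrap resamples
bootCtrl <- rfeControl(functions = caretFuncs, number = 200)
bootCtrl$method   # resampling method used by the outer loop
bootCtrl$number   # number of resampling iterations

# Assumed alternative: with cross-validation, number is the fold count
cvCtrl <- rfeControl(functions = caretFuncs,
                     method = "repeatedcv",
                     number = 10,    # 10 folds
                     repeats = 5)    # repeated 5 times
cvCtrl$method
cvCtrl$number
cvCtrl$repeats
```

Fewer resamples run faster but give noisier performance estimates for each subset size, so the value is a cost/precision trade-off.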