
I'm trying to use svm from e1071, and before going too far with heavy data I wanted to play with toy examples.

Here's what I am doing, and I don't understand why it obviously doesn't work.

# generate some silly 2D data
X <- data.frame(x1 = runif(10), x2 = runif(10))
# attach a label according to position above/below the diagonal x1 + x2 = 1
X$y <- rep(1, 10)
X$y[(X$x1 + X$x2) < 1] <- -1
X$y <- factor(X$y)
# train an svm model
require(e1071)
meta <- svm(y ~ ., data = X, kernel = "linear", scale = FALSE)
# visualize the result
plot(meta, X)

[plot.svm output]

Already at this point the error is visible on the graph: some points are misclassified, and the classifier is not the one I'm expecting (in particular, every point is a support vector).

If I then try to predict, it's wrong too:

predict(meta, newdata = X[,-3])==X$y
[1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE

If I try to predict manually, I can't get it working either:

omega <- t(meta$coefs)%*%meta$SV
pred <- c(-sign(omega%*%t(X[,-3]) - meta$rho))
pred==X$y
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE

I'm sure there is something I am missing, but I can't figure out what!

ClementWalter

1 Answer


I think there are two separate problems here: your model and your plot. The model problem is easy to solve, but the plot is more confusing.

Too many support vectors and incorrect predictions

SVMs generally expect scaled inputs (mean = 0, sd = 1). See this explanation of why SVM takes scaled inputs.

You can either scale your inputs first using the base R scale function, or set scale=TRUE when calling svm. I suggest scaling manually, for better control:

X <- as.data.frame(scale(data.frame(x1 = runif(10), x2 = runif(10))))
X$y <- rep(1, 10)
X$y[(X$x1 + X$x2)<0] <- -1
X$y <- factor(X$y)
require(e1071)
meta <- svm(y~., data = X, kernel = "linear")

You should now have a sensible number of support vectors:

meta

  Call:
  svm(formula = y ~ ., data = X, kernel = "linear")


  Parameters:
     SVM-Type:  C-classification 
   SVM-Kernel:  linear 
         cost:  1 
        gamma:  0.5 

  Number of Support Vectors:  4
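One caveat when scaling manually rather than with scale=TRUE: svm then knows nothing about the transformation, so any new data must be shifted and scaled with the training set's parameters before calling predict. A minimal sketch of the same toy setup, with the centre/scale attributes that scale stores being reused (the set.seed value is arbitrary):

```r
library(e1071)

set.seed(1)                                        # reproducible toy data
raw <- data.frame(x1 = runif(10), x2 = runif(10))
sc  <- scale(raw)                                  # keeps centre/scale as attributes
X   <- as.data.frame(sc)
X$y <- factor(ifelse(X$x1 + X$x2 < 0, -1, 1))

meta <- svm(y ~ ., data = X, kernel = "linear", scale = FALSE)

# New raw points must be transformed with the *training* centre and scale:
new_raw <- data.frame(x1 = runif(3), x2 = runif(3))
new_sc  <- sweep(sweep(new_raw, 2, attr(sc, "scaled:center"), "-"),
                 2, attr(sc, "scaled:scale"), "/")
predict(meta, newdata = new_sc)
```

With scale=TRUE this bookkeeping is done for you, since the model stores its own scaling and predict applies it to newdata automatically.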

Predictions should now also be perfect:

predict(meta, newdata = X[,-3])==X$y
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
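Your manual prediction from the question also works once the data is scaled. A sketch, refitting with scale = FALSE so that meta$SV lives in the same coordinates as X; note that the overall sign of the decision values depends on the internal ordering of the factor levels, which is why your original code needed the minus sign:

```r
library(e1071)

set.seed(1)
X <- as.data.frame(scale(data.frame(x1 = runif(10), x2 = runif(10))))
X$y <- factor(ifelse(X$x1 + X$x2 < 0, -1, 1))
meta <- svm(y ~ ., data = X, kernel = "linear", scale = FALSE)

# Recover the linear decision function by hand: f(x) = w . x - rho
w  <- t(meta$coefs) %*% meta$SV
dv <- as.matrix(X[, c("x1", "x2")]) %*% t(w) - meta$rho

# e1071 exposes its own decision values, so the hand calculation can be
# checked directly (they agree up to that overall sign):
built_in <- attr(predict(meta, X[, c("x1", "x2")], decision.values = TRUE),
                 "decision.values")
max(abs(abs(c(dv)) - abs(c(built_in))))            # ~ 0 when they match
```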

Plotting the SVM

I still have the same problem as you when I plot the SVM, though: several "x" and "o" labels lie on the wrong side of the decision boundary.

Yet if I plot it manually using ggplot, the results look correct:

plotgrid <- expand.grid(seq(-2, 2, 0.1), seq(-2, 2, 0.1))
names(plotgrid) <- c("x1", "x2")
plotgrid$y <- predict(meta, newdata=plotgrid)
library(ggplot2)
ggplot(plotgrid) +
    geom_point(aes(x1, x2, colour=y)) +
    geom_text(data=X, aes(x1, x2, label=ifelse(y==-1, "O", "X"))) +
    ggtitle("Manual SVM Plot")

Manual SVM Plot

So at least we know the underlying SVM model is correct. Indeed, the decision boundary itself is plotted correctly by plot.svm (you can confirm this by swapping the x1 and x2 axes in your ggplot call, to match the axis labels that plot.svm uses by default).
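Rather than rearranging the ggplot, you can also tell plot.svm explicitly which variable goes on which axis: it accepts an optional formula argument of the form yaxis_var ~ xaxis_var. A sketch, assuming the same scaled toy model as above:

```r
library(e1071)

set.seed(1)
X <- as.data.frame(scale(data.frame(x1 = runif(10), x2 = runif(10))))
X$y <- factor(ifelse(X$x1 + X$x2 < 0, -1, 1))
meta <- svm(y ~ ., data = X, kernel = "linear")

# Put x2 on the vertical axis and x1 on the horizontal axis, matching
# the orientation of the ggplot version:
plot(meta, X, x2 ~ x1)
```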

The problem appears to be that plot.svm is labelling the points incorrectly. I am not sure why. If anyone knows, please leave a comment and I will update this answer. In the meantime, I hope the ggplot workaround will suffice.

ajrwhite
  • thanks for digging into this problem. As far as I know about svm, even though the prediction is right, the svm is wrong: according to your plot there should be 3 support vectors, and the separation should probably have more or less the same slope but a higher intercept – ClementWalter May 11 '16 at 13:11
  • 1
    By the strict "maximal margin" definition of SVM, I think you're right. But have a look at Page 6 of https://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf (the whole paper is worth reading, by the way). I think the LibSVM calculation is more complicated, and involves "slack" variables, hence there are extra support vectors. This is to avoid overfitting and make use of more data. – ajrwhite May 11 '16 at 13:58
  • I got your point; indeed I was thinking that for such an easy example these slack variables wouldn't change anything – ClementWalter May 11 '16 at 14:22