
SVMs work by mapping points to higher and higher dimensions until a linear boundary can be found.

Does an SVM always succeed in finding a linear decision boundary?

Leockl

1 Answer


First, SVMs do not map points to higher and higher dimensions. The mapping is fixed by the choice of kernel: with a linear kernel the points are not mapped at all, while with some other kernels, e.g. the RBF kernel, they are implicitly mapped to an infinite-dimensional space.

As for your question, I suppose you mean whether an SVM with an RBF kernel can find a separating hyperplane in the mapped space. It is proven here that with a small enough σ^2 and a large enough C, it can always find a separating hyperplane, i.e., the training accuracy is 100% (assuming no two identical training points carry different labels).
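
A minimal sketch of that claim, assuming scikit-learn is available. The XOR-style toy data, the random seed, and the particular values of `sigma2` and `C` are illustrative choices, not something taken from the proof. It uses the conversion γ = 1/(2σ^2) to pass a small σ^2 to `SVC` as a large `gamma`:

```python
import numpy as np
from sklearn.svm import SVC

# Toy data (an assumption for illustration): XOR-style labels,
# which no straight line in the original 2-D space can separate.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

# Linear kernel: no mapping, so the boundary must be a line in 2-D.
linear_clf = SVC(kernel="linear", C=1.0).fit(X, y)
linear_acc = linear_clf.score(X, y)

# RBF kernel with small sigma^2 (large gamma) and large C.
sigma2 = 0.0005
gamma = 1.0 / (2.0 * sigma2)
rbf_clf = SVC(kernel="rbf", gamma=gamma, C=1e6).fit(X, y)
rbf_acc = rbf_clf.score(X, y)

print(f"linear kernel training accuracy: {linear_acc:.3f}")
print(f"RBF kernel training accuracy:    {rbf_acc:.3f}")
```

Note that both scores are training accuracies. A perfect fit obtained this way says nothing about performance on unseen data.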

hychou
  • OK, many thanks for your answer on SVM with the RBF kernel, this was what I was after. Apologies, I should have been clearer. If σ^2 is large and C is small, will an SVM with the RBF kernel fail, or take a really long time, to find a separating hyperplane? If so, what other kernel would you recommend using? – Leockl Feb 06 '20 at 04:15
  • Sorry, I do not understand your question. We do not fix σ^2 or C; we find the best σ^2 and C for each particular case using cross-validation. If your goal is to find a separating hyperplane, the RBF kernel always does the job. Just be careful about [overfitting](https://en.wikipedia.org/wiki/Overfitting). – hychou Feb 06 '20 at 05:15
  • If you use Python, sklearn’s SVM lets you choose your own C (and you are right, you probably cannot choose σ^2). See here: https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html – Leockl Feb 06 '20 at 14:05
  • Correct me if I am wrong, but I think the RBF kernel would always do the job too, although in some cases it can take a very long time to find the separating hyperplane. – Leockl Feb 06 '20 at 14:07
  • γ = 1/(2σ^2) in [sklearn.svm](https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html) and [libsvm](https://www.csie.ntu.edu.tw/~cjlin/libsvm/), so σ^2 is set through the `gamma` parameter. Quoting the [wiki](https://en.wikipedia.org/wiki/Radial_basis_function_kernel): "An equivalent definition involves a parameter γ = 1/(2σ^2)." – hychou Feb 07 '20 at 02:00
  • The time an SVM takes to solve the same dataset with different γ's shouldn't be that different, IMO. I mean, if you can find a solution at small γ (large σ^2, where the kernel behaves almost like a linear one), then in my experience it shouldn't take arbitrarily long to find a solution at large γ (small σ^2, hence a powerful RBF). – hychou Feb 07 '20 at 02:16
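
The relation γ = 1/(2σ^2) mentioned in the comments can be checked numerically. This is a small sketch using only the standard library. The helper names `rbf_with_sigma2`, `rbf_with_gamma`, and `gamma_from_sigma2` are hypothetical, made up for this illustration, and are not part of sklearn or libsvm:

```python
import math

def squared_distance(x, z):
    # Squared Euclidean distance between two equal-length points.
    return sum((a - b) ** 2 for a, b in zip(x, z))

def rbf_with_sigma2(x, z, sigma2):
    # RBF kernel in the sigma^2 form: exp(-||x - z||^2 / (2 * sigma^2)).
    return math.exp(-squared_distance(x, z) / (2.0 * sigma2))

def rbf_with_gamma(x, z, gamma):
    # RBF kernel in the gamma form: exp(-gamma * ||x - z||^2).
    return math.exp(-gamma * squared_distance(x, z))

def gamma_from_sigma2(sigma2):
    # Conversion quoted in the comments: gamma = 1 / (2 * sigma^2).
    return 1.0 / (2.0 * sigma2)

x, z = (0.0, 1.0), (1.0, 3.0)
sigma2 = 2.0
gamma = gamma_from_sigma2(sigma2)

k_sigma = rbf_with_sigma2(x, z, sigma2)
k_gamma = rbf_with_gamma(x, z, gamma)
print(gamma, k_sigma, k_gamma)
```

Both forms give the same kernel value, so a small σ^2 corresponds to a large `gamma` and vice versa.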