
Linear classifiers on linearly separable data can have more than one boundary that correctly classifies the data. This is why we use an SVM: it chooses the boundary with the maximum margin (minimum generalization error on unseen data).

Does SVM classification always produce a unique solution (or could we get two maximum-margin boundaries for some data set)?

Does the answer depend on whether we use a hard-margin or a soft-margin SVM?

  • You could reframe the question a bit. Does the SMO (sequential minimal optimization) algorithm guarantee a globally optimal result? Wikipedia was not much help in that regard... – Craig Wright Sep 26 '12 at 17:36
  • SMO does not have anything to do with whether there exists a global optimum. That is a property of the optimization problem itself (e.g. whether it is convex or not). SMO is just a numerical means of obtaining some solution given an objective function and constraints. You could feed SMO a non-convex problem with many local optima and then not be assured of anything about the solution it finds. – ely Sep 27 '12 at 11:31

2 Answers


Yes, both the soft and hard formulations of the standard SVM are convex optimization problems, and hence have unique global optima. I suppose if the problem is incredibly huge, approximation methods would be economical enough that you would use them instead of exact solvers, and then your numerical solution technique might not find the global optimum, purely because its trade-off benefit is reduced search time.

The typical approach to these is sequential minimal optimization -- hold some variables fixed and optimize over a small subset of the variables, then repeat with different variables over and over until you can't improve the objective function. Given that, I find it implausible that anyone would solve these problems in a way that won't yield the global optimum.
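
As a quick sanity check on the convexity claim (this sketch is mine, not part of the original answer; the toy data and the value of C are only illustrative), you can formulate the soft-margin primal as a QP with a generic convex solver (CVXPY) and compare it against libsvm's SMO implementation in scikit-learn. On the same data with the same C they should recover essentially the same hyperplane:

```python
# Sketch: soft-margin primal QP vs. libsvm's SMO on the same toy data.
# min 0.5*||w||^2 + C*sum(xi)  s.t.  y_i (w.x_i + b) >= 1 - xi_i,  xi_i >= 0
import numpy as np
import cvxpy as cp
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y = np.hstack([-np.ones(20), np.ones(20)])
C = 1.0  # illustrative regularization strength

# Generic convex solver on the primal problem.
w, b = cp.Variable(2), cp.Variable()
xi = cp.Variable(len(y), nonneg=True)
prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(w) + C * cp.sum(xi)),
                  [cp.multiply(y, X @ w + b) >= 1 - xi])
prob.solve()

# SMO (via libsvm) on the same data, same C, linear kernel.
clf = SVC(kernel="linear", C=C).fit(X, y)

print("CVXPY  w, b:", w.value, b.value)
print("libsvm w, b:", clf.coef_.ravel(), clf.intercept_[0])
```

Up to solver tolerances, the two hyperplanes agree, because both methods converge to the same global optimum of the same convex objective.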

Of course, the global optimum you find might not actually be appropriate for your data; that depends on how well your model, noisy class labels, etc. represent the data generating process. So solving this doesn't guarantee you've found the absolute right classifier or anything.

Here are some lecture notes I found about this in a cursory search: (link)

Here is a more direct link regarding the convexity claims: (link)

ely
  • Convex optimization problems do not have unique solutions. You need strict convexity for that. – DavidR Feb 09 '17 at 02:16
  • This seems irrelevant. – ely Feb 09 '17 at 18:35
  • For convex problems, every local minimum is a global minimum, but there may be multiple minimizers (e.g. the function f(x)=0 is convex and is minimized everywhere). With a strictly convex objective, any local minimizer is also the unique global minimizer. But even strictly convex objective functions may not have a minimizer at all, e.g. f(x)=1/x. For a hard-margin SVM, if we only have data from one class, there is no solution. (Though otherwise a solution exists and is unique if the data are separable.) For soft-margin, if there's an unregularized bias b, you can get multiple solutions. – DavidR Feb 09 '17 at 21:07
  • These considerations are mathematically valid, just not relevant to the practical question at hand. – ely Feb 09 '17 at 22:51
  • Sorry - I couldn't tell that this was a practical question. I was searching for an answer to the theoretical question (which could have been written the same way), and was frustrated that I wasn't finding it. – DavidR Feb 10 '17 at 16:07
  • No worries, glad to have your points documented in the comments. A more theoretical answer would probably belong at mathoverflow or math.stackexchange, generally. – ely Feb 10 '17 at 16:36
  • First link is broken, FYI. – Quinn Culver Nov 24 '19 at 20:04

For hard-margin classifiers with no regularization, the SVM problem can be converted to a coercive quadratic programming problem with linear constraints (assuming a solution / positive margin exists). Coercive quadratic programming problems with linear constraints have unique global minima, and simple optimization methods (like gradient descent or the perceptron algorithm) are guaranteed to converge to the global minimum. See for example

http://optimization-online.org/DB_FILE/2007/05/1662.pdf
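
For illustration (my own sketch, not taken from the paper above), the hard-margin problem is the QP: minimize 0.5*||w||^2 subject to y_i (w.x_i + b) >= 1. The "assuming a solution / positive margin exists" caveat shows up as feasibility of the constraints; on a separable toy set, a generic QP solver such as CVXPY returns the unique maximum-margin hyperplane:

```python
# Sketch: hard-margin SVM as a QP with linear constraints (no slack variables).
import numpy as np
import cvxpy as cp

X = np.array([[-2.0, 0.0], [-1.5, 1.0], [2.0, 0.0], [1.5, -1.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])  # linearly separable toy data

w, b = cp.Variable(2), cp.Variable()
prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(w)),
                  [cp.multiply(y, X @ w + b) >= 1])
prob.solve()
print(prob.status, w.value, b.value)  # "optimal"; infeasible if classes overlap
```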

For soft margin SVMs and for SVMs with regularization terms, I think there are unique global minima and the usual techniques converge to the global minimum, but I am not aware of any proofs that cover all the possibilities.

Hans Scundal