9

I understood the overall SVM algorithm consisting of Lagrangian Duality and all, but I am not able to understand why particularly the Lagrangian multiplier is greater than zero for support vectors.

Thank you.

Neel Shah
  • 349
  • 1
  • 4
  • 12

2 Answers2

10

This might be a late answer but I am putting my understanding here for other visitors.

Lagrangian multiplier, usually denoted by α is a vector of the weights of all the training points as support vectors.

Suppose there are m training examples. Then α is a vector of size m. Now focus on any ith element of α: αi. It is clear that αi captures the weight of the ith training example as a support vector. Higher value of αi means that ith training example holds more importance as a support vector; something like if a prediction is to be made, then that ith training example will be more important in deriving the decision.

Now coming to the OP's concern:

I am not able to understand why particularly the Lagrangian multiplier is greater than zero for support vectors.

It is just a construct. When you say αi=0, it is just that ith training example has zero weight as a support vector. You can instead also say that that ith example is not a support vector.

Side note: One of the KKT's conditions is the complementary slackness: αigi(w)=0 for all i. For a support vector, it must lie on the margin which implies that gi(w)=0. Now αi can or cannot be zero; anyway it is satisfying the complementary slackness condition. For αi=0, you can choose whether you want to call such points a support vector or not based on the discussion given above. But for a non-support vector, αi must be zero for satisfying the complementary slackness as gi(w) is not zero.

Ankit Shubham
  • 2,989
  • 2
  • 36
  • 61
0

I can't figure this out too...

If we take a simple example, say of 3 data points, 2 of positive class (yi=1): (1,2) (3,1) and one negative (yi=-1): (-1,-1) - and we calculate using Lagrange multipliers, we will get a perfect w (0.25,0.5) and b = -0.25, but one of our alphas was negative (a1 = 6/32, a2 = -1/32, a3 = 5/32).

Maverick Meerkat
  • 5,737
  • 3
  • 47
  • 66