I have learned about support vector machines and gotten an equation for the cost function:

J(w) = (C/2)·||w||^2 + (1/|D_tra|)·sum over (x,y) in D_tra of hinge(x, y, w)

Moreover, I understand that when you take the gradient of this, set it equal to 0, and solve for w, you get:

w = 1/(C·|D_tra|) · sum over (x,y) in the support vectors of: y·x

Basically, you get a value for w in terms of its support vectors. What I don't understand is: what is the purpose of this? Is this calculated value for w the minimum? If not, how do you use this formula for w? Sorry if the equations are confusing, there's no math mode on Stack Overflow yet :(
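Written out in LaTeX for later readers (filling in the 1/|D_tra| averaging that makes the two formulas above consistent with each other, and writing the hinge loss explicitly), the derivation the question describes is:

```latex
J(w) = \frac{C}{2}\|w\|^2
     + \frac{1}{|D_{tra}|} \sum_{(x,y)\in D_{tra}} \max\bigl(0,\; 1 - y\, w \cdot x\bigr)

\nabla J = C\,w \;-\; \frac{1}{|D_{tra}|} \sum_{(x,y):\; y\, w\cdot x \,\le\, 1} y\,x \;=\; 0
\quad\Longrightarrow\quad
w = \frac{1}{C\,|D_{tra}|} \sum_{(x,y)\in \mathrm{SV}} y\,x
```

Strictly speaking, points lying exactly on the margin enter this sum with fractional weights (the dual coefficients), so the final formula is the idealized form in which every support vector counts with weight 1.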
-
No one helped me :((((. To any late-night googlers stuck on this one, I figured it out. 1. Is this the calculated value for w at the minimum? A1: Yes, but you cannot use this formula to FIND w, because without already knowing w you won't know which data points lie on the margin or within it, i.e. which support vectors to sum up in the first place. 2. How do you use this formula for w? A2: After performing stochastic gradient descent on the SVM cost function (as with any supervised learning model), you will have an optimal value for w, which defines your optimal support vectors. (1/2) – otj202 Feb 23 '20 at 18:09
-
Instead of storing w, you store the support vectors, and each time the model is called to make a classification, you recalculate the decision value from these support vectors. Store w nowhere! If you wonder what the use of this is, as I did: a model kept in this form extends easily to nonlinear decision boundaries via the kernel trick. – otj202 Feb 23 '20 at 18:14
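A minimal sketch of the recipe in these comments, using a kernelized Pegasos-style subgradient method (the specific algorithm is my assumption; the comments only say "stochastic gradient descent"). Instead of materializing w, we store one coefficient `alpha[i]` per training point, nonzero only for the support vectors, and rebuild the decision value from them at every prediction. Swapping `linear_kernel` for any other kernel then gives a nonlinear classifier with no other change, which is the payoff the second comment mentions. The data is made up for illustration.

```python
import numpy as np

def linear_kernel(a, b):
    """Plain dot product; replace with e.g. an RBF for a nonlinear model."""
    return a @ b

def decision(x, X, y, alpha, lam, t, kernel=linear_kernel):
    """Decision value rebuilt from the stored support vectors on each call.

    Only points with alpha[j] > 0 (the support vectors) contribute,
    so the weight vector w is never stored anywhere.
    """
    return sum(alpha[j] * y[j] * kernel(X[j], x)
               for j in range(len(X)) if alpha[j] > 0) / (lam * t)

def train(X, y, lam=0.1, T=2000, kernel=linear_kernel):
    """Pegasos-style subgradient training, cycling through the data.

    alpha[i] counts how often example i violated the margin; an example
    with alpha[i] == 0 after training is not a support vector.
    """
    n = len(X)
    alpha = np.zeros(n)
    for t in range(1, T + 1):
        i = t % n
        f = decision(X[i], X, y, alpha, lam, t, kernel)
        if y[i] * f < 1:   # margin violation: i becomes/stays a support vector
            alpha[i] += 1
    return alpha

# Toy linearly separable data (hypothetical, for illustration only).
X = np.array([[2., 2.], [3., 3.], [-2., -2.], [-3., -3.]])
y = np.array([1., 1., -1., -1.])
lam, T = 0.1, 2000
alpha = train(X, y, lam=lam, T=T)

# Classify: every prediction is recomputed from the support vectors.
preds = [np.sign(decision(x, X, y, alpha, lam, T)) for x in X]
```

The model here is represented entirely by the (support vector, label, alpha) triples, matching the comment's "store w nowhere" advice; using an RBF kernel in place of `linear_kernel` would require no other modification.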