
How can I get an easy explanation of coordinate descent and the subgradient solution in the context of the lasso?

An intuitive explanation followed by a proof would be helpful.

shan

1 Answer


Suppose you have a multivariate function F(w) with K variables/parameters w_1, w_2, w_3, ..., w_K. The parameters are the knobs, and the goal is to turn these knobs so that F is minimized. Coordinate descent is greedy in the sense that on each iteration you change one parameter w_i at a time so as to reduce F. It is very easy to implement and, like gradient descent, it never increases F from one iteration to the next, so for well-behaved (e.g., smooth or convex) functions it reaches a local minimum.
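To make this concrete, here is a minimal Python sketch (mine, not part of the original answer) of cyclic coordinate descent on a simple quadratic F(w) = 0.5 * w'Aw - b'w, where each coordinate is minimized exactly while the others are held fixed. The matrix A, vector b, and function names are illustrative choices.

```python
import numpy as np

def coordinate_descent(A, b, n_iters=100):
    """Minimize F(w) = 0.5 * w'Aw - b'w (A symmetric positive definite)
    by cyclically minimizing over one coordinate at a time."""
    K = len(b)
    w = np.zeros(K)
    for _ in range(n_iters):
        for i in range(K):
            # Setting dF/dw_i = A[i, :] @ w - b[i] to zero and solving for w_i
            # (all other coordinates held fixed) gives the exact 1-D minimizer.
            w[i] = (b[i] - A[i, :] @ w + A[i, i] * w[i]) / A[i, i]
    return w

# Tiny 2-variable example; the result approaches np.linalg.solve(A, b).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
print(coordinate_descent(A, b))
```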

[Figure: coordinate descent on a function F of two parameters x and y. Picture borrowed from the Internet through a Bing image search.]

As shown in the picture above, the function F has two parameters, x and y. On each iteration the parameters are changed one at a time by a fixed value c, and the function is evaluated at the new point. If the value is higher (and the goal is to minimize the function), the change is reversed for the selected parameter. The same procedure is then repeated for the second parameter; together this makes up one iteration of the algorithm.
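A small sketch of exactly this derivative-free procedure (again mine, not from the answer): each parameter in turn is perturbed by the fixed step c, and the change is kept only if it lowers F. The test function, starting point, and step size are made up for the example.

```python
import numpy as np

def coordinate_search(F, w0, c=0.1, n_iters=100):
    """Derivative-free coordinate descent: try changing one parameter at a
    time by a fixed step c; keep the change only if F decreases."""
    w = np.array(w0, dtype=float)
    best = F(w)
    for _ in range(n_iters):
        for i in range(len(w)):
            for step in (c, -c):
                trial = w.copy()
                trial[i] += step
                value = F(trial)
                if value < best:      # accept the move
                    w, best = trial, value
                    break             # otherwise the change is reversed and -c is tried
    return w

# Example with a two-parameter function F(x, y); the minimum is at (1, -2).
F = lambda v: (v[0] - 1.0) ** 2 + (v[1] + 2.0) ** 2
print(coordinate_search(F, [0.0, 0.0]))
```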

An advantage of coordinate descent is in problems where computing the gradient of the function is expensive.
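Since the question asks about the lasso specifically, it may help to add (my own note, not part of the original answer) that for the lasso objective, assumed here to be (1/(2n)) * ||y - Xw||^2 + lambda * ||w||_1, each coordinate update has a closed form that comes out of the subgradient optimality condition: soft-thresholding. A minimal sketch, with illustrative data and names:

```python
import numpy as np

def soft_threshold(rho, lam):
    """Soft-thresholding operator, the subgradient solution of the 1-D lasso problem."""
    return np.sign(rho) * max(abs(rho) - lam, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iters=100):
    """Cyclic coordinate descent for (1/(2n))*||y - Xw||^2 + lam*||w||_1."""
    n, K = X.shape
    w = np.zeros(K)
    for _ in range(n_iters):
        for j in range(K):
            residual = y - X @ w + X[:, j] * w[j]   # leave coordinate j out
            rho = X[:, j] @ residual / n            # correlation of feature j with the residual
            z = X[:, j] @ X[:, j] / n
            w[j] = soft_threshold(rho, lam) / z     # closed-form 1-D minimizer
    return w

# Illustrative synthetic data: only the 1st and 3rd true coefficients are nonzero.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
y = X @ np.array([2.0, 0.0, -1.0, 0.0, 0.0]) + 0.1 * rng.standard_normal(50)
print(lasso_coordinate_descent(X, y, lam=0.1))
```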

Sources

Amir
  • Thanks. I have heard that where computing the gradient is expensive, a subgradient is computed to get a solution, e.g. for the absolute value function. I am wondering if I can get an intuitive explanation of computing subgradients? – shan Jan 17 '16 at 12:25
  • @shan Well, to my knowledge, computing a subgradient is in fact computing the derivative of a function at a given point. Take a look at [this link](https://en.wikipedia.org/wiki/Subderivative) for more information about subgradients. Although it makes sense, I personally do not know what the exact relationship is between coordinate descent and subgradient methods. What I can tell you is that if a function is convex, subgradient is the gradient itself. – Amir Jan 17 '16 at 19:50
  • 2
    "if a function is convex, subgradient is the gradient itself" -- that is incorrect. For example, the function f(x)=|x| is convex, but at x=0 each point in the range [-1,1] is a subgradient. It should be: if the function is differentiable at x0 ==> the only subgradient is the gradient. – Tomer Levinboim Jan 18 '16 at 23:35
  • Thanks all. @Tomer Can you please elaborate on the concept of a subgradient? If possible, a link to a detailed explanation. – shan Jan 19 '16 at 00:05
  • 1
    See Figures 3 and 1 in the following lecture notes by Boyd and Vandenberghe: https://see.stanford.edu/materials/lsocoee364b/01-subgradients_notes.pdf... – Tomer Levinboim Jan 19 '16 at 00:30