26

I am not able to find anything about gradient ascent. Any good link about gradient ascent demonstrating how it is different from gradient descent would help.

geompalik
user2714423
  • Gradient descent is like dropping a marble into an oddly shaped bowl, whereas gradient ascent is releasing a lighter-than-air balloon inside an oddly shaped dome tent. The difference is only in where the marble/balloon is nudged, and where it ultimately stops moving. Here is a working example of gradient descent written in GNU Octave: https://github.com/schneems/Octave/blob/master/mlclass-ex4/mlclass-ex4/fmincg.m – Eric Leschinski Apr 26 '21 at 13:40
  • Gradient descent solves a minimization problem. Change the sign, make it a maximization problem, and now you're using gradient ascent. – duffymo Apr 26 '21 at 13:52
  • Gradient descent is an iterative procedure that treats your function as a surface and nudges the input variables step by step until the model converges on an optimal answer. "The gradient" is the set of all partial derivatives describing the slope of the surface at the current point. A blind man can climb a mountain if he "takes a step up" until he can't anymore. Pursue a master's degree in CS and ML and this will be coursework. – Eric Leschinski Apr 26 '21 at 14:25

6 Answers

30

It is not different. Gradient ascent is just the process of maximizing, instead of minimizing, a loss function. Everything else is entirely the same. Ascent for some loss function, you could say, is like gradient descent on the negative of that loss function.
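
A minimal sketch of that equivalence in Python, assuming a made-up quadratic f(x) = -(x - 3)^2 to maximize (the function, step size, and iteration count are only illustrative):

```python
def f_grad(x):
    # derivative of f(x) = -(x - 3)^2, the function we want to maximize
    return -2.0 * (x - 3.0)

lr = 0.1
x_ascent = x_descent = 0.0
for _ in range(100):
    x_ascent += lr * f_grad(x_ascent)         # gradient ascent on f
    x_descent -= lr * (-f_grad(x_descent))    # gradient descent on -f

print(x_ascent, x_descent)  # both converge to ~3.0, the maximizer of f
```

The two update rules differ only in sign convention, so they trace out exactly the same iterates.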

Sean Owen
24

Typically, you'd use gradient ascent to maximize a likelihood function, and gradient descent to minimize a cost function. Both gradient descent and ascent are practically the same. Let me give you a concrete example using a simple algorithm that lends itself to gradient-based optimization and has a concave likelihood (equivalently, a convex cost) function: logistic regression.


The likelihood function that you want to maximize in logistic regression is

$$L(\mathbf{w}) = P(\mathbf{y}\mid\mathbf{x};\mathbf{w}) = \prod_{i=1}^{n}\phi\left(z^{(i)}\right)^{y^{(i)}}\left(1-\phi\left(z^{(i)}\right)\right)^{1-y^{(i)}}$$

where "phi" is simply the sigmoid function

enter image description here

Now, since you want a concave function for gradient ascent, take the log to get the log-likelihood:

$$\ell(\mathbf{w}) = \log L(\mathbf{w}) = \sum_{i=1}^{n}\left[y^{(i)}\log\phi\left(z^{(i)}\right) + \left(1-y^{(i)}\right)\log\left(1-\phi\left(z^{(i)}\right)\right)\right]$$

Similarly, you can simply negate it to get the cost function $J$ that you can minimize via gradient descent:

$$J(\mathbf{w}) = -\ell(\mathbf{w}) = \sum_{i=1}^{n}\left[-y^{(i)}\log\phi\left(z^{(i)}\right) - \left(1-y^{(i)}\right)\log\left(1-\phi\left(z^{(i)}\right)\right)\right]$$

For the log-likelihood, you'd derive the partial derivative with respect to each weight $w_j$,

$$\frac{\partial \ell}{\partial w_j} = \sum_{i=1}^{n}\left(y^{(i)} - \phi\left(z^{(i)}\right)\right)x_j^{(i)},$$

and apply the gradient ascent update

$$w_j := w_j + \eta\,\frac{\partial \ell}{\partial w_j} = w_j + \eta\sum_{i=1}^{n}\left(y^{(i)} - \phi\left(z^{(i)}\right)\right)x_j^{(i)}$$

Since you'd want to update all weights simultaneously, let's write it in vector form as

$$\mathbf{w} := \mathbf{w} + \Delta\mathbf{w}, \qquad \Delta\mathbf{w} = \eta\,\nabla_{\mathbf{w}}\,\ell(\mathbf{w})$$

Now, it should be quite obvious that the gradient descent update on the cost function $J(\mathbf{w}) = -\ell(\mathbf{w})$ is exactly the same update; we are merely formulating it as "taking a step in the opposite direction of the gradient of the cost function":

$$\mathbf{w} := \mathbf{w} - \eta\,\nabla_{\mathbf{w}}\,J(\mathbf{w}) = \mathbf{w} + \eta\,\nabla_{\mathbf{w}}\,\ell(\mathbf{w})$$
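
Here is a minimal NumPy sketch of the two updates side by side; the tiny synthetic dataset, learning rate, and helper names (e.g. `log_likelihood_grad`) are just illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood_grad(w, X, y):
    # gradient of the log-likelihood: X^T (y - phi(Xw))
    return X.T @ (y - sigmoid(X @ w))

# tiny synthetic data: 4 samples, bias column plus 2 features
X = np.array([[1.0,  0.5,  1.0],
              [1.0,  2.0,  1.5],
              [1.0, -1.0, -0.5],
              [1.0, -2.0, -1.0]])
y = np.array([1.0, 1.0, 0.0, 0.0])
eta = 0.1

w_ascent = np.zeros(3)
w_descent = np.zeros(3)
for _ in range(1000):
    # gradient ascent on the log-likelihood l(w)
    w_ascent += eta * log_likelihood_grad(w_ascent, X, y)
    # gradient descent on the cost J(w) = -l(w); note grad J = -grad l
    w_descent -= eta * (-log_likelihood_grad(w_descent, X, y))

print(np.allclose(w_ascent, w_descent))  # True: the trajectories are identical
```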

Hope that answers your question!

  • I have asked something related to this here: http://stats.stackexchange.com/questions/261692/gradient-descent-for-logistic-regression-with-or-without-f-prime – VansFannel Feb 14 '17 at 08:34
8

Gradient descent is used to minimize a particular function, whereas gradient ascent is used to maximize a function.

Check this out http://pandamatak.com/people/anand/771/html/node33.html

Eric Leschinski
vishalg
4

Gradient ascent maximizes a function so as to achieve a better optimum; it is used in reinforcement learning, and it moves up the slope toward increasing values of the function.

Gradient descent minimizes a function, such as the cost function used in linear regression; it moves down the slope toward decreasing values of the cost.
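
As a rough illustration, here is a minimal Python sketch of gradient descent on a linear-regression mean-squared-error cost; the data points, learning rate, and iteration count are made up:

```python
import numpy as np

# made-up 1-D data roughly following y = 2x + 1
X = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])

w, b = 0.0, 0.0   # slope and intercept
lr = 0.05

for _ in range(2000):
    y_hat = w * X + b
    # gradients of the MSE cost J = mean((y_hat - y)^2)
    grad_w = 2.0 * np.mean((y_hat - y) * X)
    grad_b = 2.0 * np.mean(y_hat - y)
    # descent: step against the gradient so the cost decreases
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # approaches roughly w ≈ 2, b ≈ 1
```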

3

If you want to minimize a function, you use gradient descent. For example, in deep learning we want to minimize the loss function, hence we use gradient descent.

If you want to maximize a function, you use gradient ascent. For example, in reinforcement learning, policy gradient methods aim to maximize the expected return, hence they use gradient ascent.
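
For the gradient-ascent flavour, here is a minimal REINFORCE-style sketch on a two-armed bandit with a softmax policy; the reward values, learning rate, and iteration count are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(2)          # action preferences for a 2-armed bandit
true_rewards = [1.0, 0.2]    # arm 0 pays more on average
lr = 0.1

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for _ in range(2000):
    probs = softmax(theta)
    a = rng.choice(2, p=probs)
    r = true_rewards[a] + 0.1 * rng.standard_normal()  # noisy reward
    # REINFORCE: gradient of log pi(a) for a softmax policy
    grad_log_pi = -probs
    grad_log_pi[a] += 1.0
    # gradient *ascent* on the expected reward
    theta += lr * r * grad_log_pi

print(softmax(theta))  # probability mass shifts toward the better arm 0
```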

Shriram
0

Gradient is another word for slope. A positive gradient at a point (x, y) means the graph slopes upwards at that point; a negative gradient means it slopes downwards there.

Gradient descent is an iterative algorithm used to find a set of parameters theta that minimizes the value of a cost function. Gradient ascent, by the same logic, produces a set of theta that maximizes the value of an objective function.

John Doe