Questions tagged [gradient-descent]

Gradient Descent is an algorithm for finding the minimum of a function. It iteratively calculates the partial derivatives (the gradient) of the function and takes steps proportional to the negative of that gradient. One major application of Gradient Descent is fitting a parameterized model to a set of data: the function to be minimized is an error function for the model.
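
In symbols, the update the description above alludes to is conventionally written as follows (a standard textbook form; the symbols θ for the parameters, α for the step size or learning rate, and J for the error/cost function are conventional names, not anything fixed by the tag):

    \theta \leftarrow \theta - \alpha \, \nabla_\theta J(\theta)

For the model-fitting use mentioned above, J is typically an average of per-example errors, for instance the least-squares cost

    J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \big( h_\theta(x^{(i)}) - y^{(i)} \big)^2

where h_\theta is the parameterized model and (x^{(i)}, y^{(i)}) are the m data points.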

Wiki:

Gradient descent is a first-order iterative optimization algorithm. It is used to find the values of a function's parameters (coefficients) that minimize a cost function.

To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point.

Gradient descent is best used when the parameters cannot be calculated analytically (e.g. using linear algebra) and must be searched for by an optimization algorithm.

Gradient descent is also known as steepest descent, or the method of steepest descent.
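
As a minimal illustration of the above, here is a short Python/NumPy sketch (all names are illustrative) that fits a two-parameter linear model by repeatedly stepping along the negative gradient of its squared-error cost:

    import numpy as np

    # Minimal sketch: batch gradient descent for least-squares linear regression.
    def gradient_descent(X, y, lr=0.1, n_steps=1000):
        m, n = X.shape
        theta = np.zeros(n)                  # parameters being fitted
        for _ in range(n_steps):
            residuals = X @ theta - y        # model error on the whole data set
            grad = X.T @ residuals / m       # gradient of the (halved) mean squared error
            theta -= lr * grad               # step proportional to the negative gradient
        return theta

    # Usage: recover a known line y = 1 + 2x (bias handled by a column of ones).
    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, size=100)
    X = np.column_stack([np.ones_like(x), x])
    y = 1.0 + 2.0 * x + 0.01 * rng.normal(size=100)
    print(gradient_descent(X, y))            # approximately [1.0, 2.0]

The learning rate and step count here are arbitrary choices; in practice they are the main tuning knobs.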


Tag usage:

Questions with this tag should be about implementation and programming problems, not about the theoretical properties of the optimization algorithm. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.


1428 questions
7 votes • 2 answers

Behavioral difference between Gradient Descent and Hill Climbing

I'm trying to understand the difference between these two algorithms and how they differ in solving a problem. I have looked at the algorithms and their internals. It would be good to hear from others who are already experienced with them.…
PRCube • 566 • 2 • 6 • 19
7 votes • 4 answers

Implementing gradient descent for multiple variables in Octave using "sum"

I'm doing Andrew Ng's course on Machine Learning and I'm trying to wrap my head around the vectorised implementation of gradient descent for multiple variables which is an optional exercise in the course. This is the algorithm in question (taken…
Nobilis • 7,310 • 1 • 33 • 67
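
For the question above, the vectorised update the exercise is driving at can be sketched in NumPy roughly as follows (this is not the course's Octave code; X, y, theta and alpha are the conventional names used in such exercises):

    import numpy as np

    # One vectorised batch update for linear regression with multiple features.
    # The per-feature sum(...) form is absorbed into the single matrix product.
    def gradient_step(theta, X, y, alpha):
        m = len(y)
        return theta - (alpha / m) * (X.T @ (X @ theta - y))

The Octave equivalent is usually written in the single-line form theta = theta - (alpha/m) * X' * (X*theta - y).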
7 votes • 1 answer

Explanation for Coordinate Descent and Subgradient

How can one get an easy explanation of coordinate descent and the subgradient solution in the context of lasso? An intuitive explanation followed by a proof would be helpful.
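
No proof here, but a rough Python sketch of the intuition: coordinate descent for lasso fixes all coefficients except one, solves that one-dimensional problem exactly, and cycles. The absolute-value penalty makes each one-dimensional solution a soft-thresholding step, which is exactly where the subgradient of |w_j| shows up. The sketch below assumes the objective (1/2)||y - Xw||^2 + lam*||w||_1; all names are illustrative:

    import numpy as np

    def soft_threshold(rho, lam):
        # Closed-form minimizer of a 1-D quadratic plus an L1 term
        # (derived from the subgradient optimality condition).
        if rho < -lam:
            return rho + lam
        if rho > lam:
            return rho - lam
        return 0.0

    def lasso_coordinate_descent(X, y, lam, n_iters=100):
        n_features = X.shape[1]
        w = np.zeros(n_features)
        for _ in range(n_iters):
            for j in range(n_features):
                # Partial residual: remove feature j's current contribution.
                r_j = y - X @ w + X[:, j] * w[j]
                rho = X[:, j] @ r_j
                w[j] = soft_threshold(rho, lam) / (X[:, j] @ X[:, j])
            # (convergence check omitted in this sketch)
        return w
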
7 votes • 2 answers

Full-matrix approach to backpropagation in Artificial Neural Network

I have been learning about Artificial Neural Networks (ANN) recently and have got code working and running in Python based on mini-batch training. I followed Michael Nielsen's book Neural Networks and Deep Learning, where there is a step by step…
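
Roughly, the "full-matrix" approach asked about above replaces the per-example loop with matrix operations over the whole mini-batch at once; a hedged NumPy sketch for a single hidden layer with sigmoid activations and a quadratic cost (the variable names are illustrative, not the book's):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def full_matrix_backprop(X, Y, W1, b1, W2, b2):
        """One forward/backward pass over an entire mini-batch.
        X: (batch, d_in), Y: (batch, d_out); returns gradients averaged over the batch."""
        m = X.shape[0]
        # Forward pass, one row per example.
        Z1 = X @ W1 + b1
        A1 = sigmoid(Z1)
        Z2 = A1 @ W2 + b2
        A2 = sigmoid(Z2)
        # Backward pass, still batched: each delta keeps one row per example.
        dZ2 = (A2 - Y) * A2 * (1 - A2)       # quadratic cost with a sigmoid output layer
        dW2 = A1.T @ dZ2 / m
        db2 = dZ2.sum(axis=0) / m
        dZ1 = (dZ2 @ W2.T) * A1 * (1 - A1)
        dW1 = X.T @ dZ1 / m
        db1 = dZ1.sum(axis=0) / m
        return dW1, db1, dW2, db2
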
7 votes • 1 answer

Clarification in the Theano tutorial

I am reading this tutorial provided on the home page of the Theano documentation. I am not sure about the code given under the gradient descent section; I have doubts about the for loop. If you initialize the 'param_update' variable to…
Abhishek • 3,337 • 4 • 32 • 51
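
Without reproducing the Theano tutorial's code, the pattern such a loop usually implements is a momentum-style update in which 'param_update' acts as a persistent velocity that is itself updated on every step. A plain-Python sketch of one common formulation (the tutorial's exact coefficients and signs may differ):

    def momentum_step(param, velocity, grad, lr=0.01, momentum=0.9):
        # The velocity is a decayed running combination of past gradients;
        # keeping it between calls is what the shared 'param_update' variable does.
        velocity = momentum * velocity - lr * grad
        param = param + velocity
        return param, velocity
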
7 votes • 2 answers

Gradient Descent: Do we iterate over ALL of the training set with each step in GD, or do we change GD for each training set?

I've taught myself machine learning with some online resources, but I have a question about gradient descent that I couldn't figure out. The formula for gradient descent is given by the following logistic regression: Repeat { θj =…
Terence Chow • 10,755 • 24 • 78 • 141
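
The distinction the question above is after, sketched for logistic regression in NumPy (names are illustrative): batch gradient descent sums over all m training examples for every single parameter update, whereas stochastic gradient descent updates the parameters after each individual example.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def batch_step(theta, X, y, alpha):
        # ONE update that uses ALL m training examples.
        m = len(y)
        return theta - (alpha / m) * (X.T @ (sigmoid(X @ theta) - y))

    def stochastic_epoch(theta, X, y, alpha):
        # One pass over the data that updates theta after EACH example.
        for i in np.random.permutation(len(y)):
            theta = theta - alpha * (sigmoid(X[i] @ theta) - y[i]) * X[i]
        return theta
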
7 votes • 1 answer

Gradient Descent Optimization in CUDA

I will code my first relatively big CUDA project as Gradient Descent Optimization for machine learning purposes. I would like to benefit from crowd wisdom about some useful native CUDA functions that might be a shortcut to use in the…
erogol • 13,156 • 33 • 101 • 155
6 votes • 1 answer

Get positive and negative part of gradient for loss function in PyTorch

I want to implement non-negative matrix factorization using PyTorch. Here is my initial implementation: def nmf(X, k, lr, epochs): # X: input matrix of size (m, n) # k: number of latent factors # lr: learning rate # epochs: number of…
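
Since the excerpt is cut off, here is only a hedged sketch of the general shape such a PyTorch loop can take: projected gradient descent that clamps W and H back onto non-negative values after every step. This is not the asker's code, and it is not the multiplicative-update rule for which the positive and negative parts of the gradient are usually separated:

    import torch

    def nmf(X, k, lr=0.01, epochs=1000):
        m, n = X.shape
        W = torch.rand(m, k, requires_grad=True)
        H = torch.rand(k, n, requires_grad=True)
        for _ in range(epochs):
            loss = torch.norm(X - W @ H) ** 2
            loss.backward()
            with torch.no_grad():
                W -= lr * W.grad
                H -= lr * H.grad
                W.clamp_(min=0)          # project back onto the non-negative orthant
                H.clamp_(min=0)
            W.grad.zero_()
            H.grad.zero_()
        return W.detach(), H.detach()
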
6 votes • 1 answer

Why can't I get the result I got with the sklearn LogisticRegression with the coefficients_sgd method?

from math import exp import numpy as np from sklearn.linear_model import LogisticRegression I used the code below from How To Implement Logistic Regression From Scratch in Python: def predict(row, coefficients): yhat = coefficients[0] for i in…
user16386186
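
For readers following along, here is a hedged reconstruction of the general shape of such a from-scratch predict/SGD pair (not guaranteed to match the linked tutorial line for line; the last column of each row is assumed to hold the label):

    from math import exp

    def predict(row, coefficients):
        # Linear combination of the inputs, passed through the logistic function.
        yhat = coefficients[0]
        for i in range(len(row) - 1):
            yhat += coefficients[i + 1] * row[i]
        return 1.0 / (1.0 + exp(-yhat))

    def coefficients_sgd(train, l_rate, n_epoch):
        # Stochastic gradient descent: coefficients are updated after every row.
        coef = [0.0] * len(train[0])
        for _ in range(n_epoch):
            for row in train:
                yhat = predict(row, coef)
                error = row[-1] - yhat
                coef[0] += l_rate * error * yhat * (1.0 - yhat)
                for i in range(len(row) - 1):
                    coef[i + 1] += l_rate * error * yhat * (1.0 - yhat) * row[i]
        return coef

One common reason such a loop does not reproduce sklearn's LogisticRegression is that sklearn applies L2 regularization by default (controlled by C), while a bare SGD loop like this one does not.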
6 votes • 1 answer

PyTorch `torch.no_grad` vs `torch.inference_mode`

PyTorch has new functionality torch.inference_mode as of v1.9 which is "analogous to torch.no_grad... Code run under this mode gets better performance by disabling view tracking and version counter bumps." If I am just evaluating my model at test…
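
For plain test-time evaluation, either context manager fits; a small usage sketch (the model and batch below are placeholders):

    import torch

    model = torch.nn.Linear(10, 2)    # placeholder model
    x = torch.randn(4, 10)            # placeholder test batch

    model.eval()                      # also disables dropout / batch-norm updates

    with torch.no_grad():             # no gradient tracking
        out1 = model(x)

    with torch.inference_mode():      # additionally skips view tracking and version counters
        out2 = model(x)

One practical difference to keep in mind: tensors created under inference_mode cannot later take part in operations recorded by autograd, whereas tensors produced under no_grad can still be fed into a graph as constants.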
6 votes • 0 answers

PyTorch Autograd Differentiated Tensors appears to not have been used in the graph

I'm trying to improve a CNN I made by implementing a weighted loss method described in this paper. To do this, I looked into this notebook which implements the pseudo-code of the method described in the paper. When translating their code to my…
6 votes • 1 answer

What is the default batch size of pytorch SGD?

What does pytorch SGD do if I feed the whole data and do not specify the batch size? I don't see any "stochastic" or "randomness" in that case. For example, in the following simple code, I feed the whole data (x,y) into a model. optimizer =…
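
As context for the question above: torch.optim.SGD has no notion of a batch size at all; it simply applies whatever gradient the preceding backward() produced, so feeding the whole data set gives plain full-batch gradient descent. The "stochastic" part comes from how the data is fed, typically via a DataLoader. A sketch with placeholder shapes:

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    x, y = torch.randn(100, 3), torch.randn(100, 1)   # placeholder data
    model = torch.nn.Linear(3, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = torch.nn.MSELoss()

    # Mini-batches (and hence the randomness) come from the DataLoader, not the optimizer.
    loader = DataLoader(TensorDataset(x, y), batch_size=10, shuffle=True)
    for xb, yb in loader:
        optimizer.zero_grad()
        loss_fn(model(xb), yb).backward()
        optimizer.step()
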
6 votes • 1 answer

TensorFlow: How can I inspect gradients and weights in eager execution?

I am using TensorFlow 1.12 in eager execution, and I want to inspect the values of my gradients and my weights at different points during training for debugging purposes. This answer uses TensorBoard to get nice graphs of weight and gradient…
user4028648
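
One direct way to inspect values in eager execution is to compute the gradients yourself with tf.GradientTape and print or log them, instead of routing everything through TensorBoard. A hedged sketch with a placeholder Keras model (written against current eager APIs; on TF 1.12 you would call tf.enable_eager_execution() first):

    import numpy as np
    import tensorflow as tf

    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])   # placeholder model
    x = tf.constant(np.random.randn(8, 4), dtype=tf.float32)
    y = tf.constant(np.random.randn(8, 1), dtype=tf.float32)

    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(model(x) - y))

    grads = tape.gradient(loss, model.trainable_weights)
    for var, grad in zip(model.trainable_weights, grads):
        print(var.name, var.numpy(), grad.numpy())   # inspect weights and their gradients
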
6 votes • 3 answers

Tensorflow, Keras: How to create a trainable variable that only updates in specific positions?

For example, y=Ax where A is a diagonal matrix, with its trainable weights (w1, w2, w3) on the diagonal. A = [w1 ... ... ... w2 ... ... ... w3] How to create such a trainable A in Tensorflow or Keras? If I try A = tf.Variable(np.eye(3)),…
null • 1,167 • 1 • 12 • 30
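
One common pattern for the question above is to keep only the diagonal as a trainable vector and rebuild the matrix from it on every forward pass, so the off-diagonal zeros are never parameters at all. A hedged TensorFlow sketch (tf.linalg.diag and tf.linalg.matvec are real ops in recent TF versions; the surrounding names are placeholders):

    import tensorflow as tf

    w = tf.Variable(tf.ones(3))           # only the three diagonal weights are trainable
    x = tf.constant([1.0, 2.0, 3.0])      # placeholder input

    with tf.GradientTape() as tape:
        A = tf.linalg.diag(w)             # build the diagonal matrix from the vector
        y = tf.linalg.matvec(A, x)        # y = A x
        loss = tf.reduce_sum(y)

    grads = tape.gradient(loss, [w])      # gradients flow only into the diagonal entries

In Keras the same idea is usually wrapped in a custom layer whose only weight is the length-3 vector.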
6 votes • 2 answers

Escaping local minimum with tensorflow

I am solving this system of equations with tensorflow: f1 = y - x*x = 0 f2 = x - (y - 2)*(y - 2) + 1.1 = 0 If I choose a bad starting point (x,y)=(-1.3,2), then I get stuck in a local minimum while optimising f1^2+f2^2 with this code: f1 = y - x*x f2 = x - (y -…
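
Since the excerpt is cut off, only the general idea is sketched here: one common way out of such a local minimum is to restart the minimization of f1^2 + f2^2 from several random points and keep the best run. A plain-NumPy sketch of that strategy (no TensorFlow, just the same objective with a hand-written gradient; the learning rate and start range are arbitrary):

    import numpy as np

    def objective_and_grad(x, y):
        f1 = y - x * x
        f2 = x - (y - 2) * (y - 2) + 1.1
        F = f1 ** 2 + f2 ** 2
        dF_dx = 2 * f1 * (-2 * x) + 2 * f2
        dF_dy = 2 * f1 + 2 * f2 * (-2 * (y - 2))
        return F, dF_dx, dF_dy

    def descend(x, y, lr=0.005, steps=4000):
        for _ in range(steps):
            _, gx, gy = objective_and_grad(x, y)
            x, y = x - lr * gx, y - lr * gy
        return objective_and_grad(x, y)[0], x, y

    # Random restarts: run gradient descent from several starting points and
    # keep whichever run reaches the lowest value of f1^2 + f2^2.
    rng = np.random.default_rng(0)
    best = min(descend(*rng.uniform(-2, 2, size=2)) for _ in range(20))
    print(best)   # (best objective value, x, y)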