Questions tagged [gradient-descent]

Gradient Descent is an algorithm for finding a minimum of a function. It iteratively computes the gradient (the vector of partial derivatives) of the function and takes steps proportional to the negative of that gradient. One major application of Gradient Descent is fitting a parameterized model to a set of data: the function to be minimized is an error function for the model.

Wiki:

Gradient descent is a first-order iterative optimization algorithm. It is used to find the values of the parameters (coefficients) of a function f that minimize a cost function.

To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point.
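In symbols, one standard way to write the update (with α the step size and x_n the current point):

```latex
x_{n+1} = x_n - \alpha \nabla f(x_n)
```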

Gradient descent is best used when the parameters cannot be calculated analytically (e.g. using linear algebra) and must be searched for by an optimization algorithm.

Gradient descent is also known as steepest descent, or the method of steepest descent.
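A minimal sketch of the idea in Python/NumPy; the quadratic objective, step size, and iteration count below are illustrative choices, not part of the tag wiki:

```python
import numpy as np

def gradient_descent(grad, x0, step_size=0.1, n_iters=100):
    """Generic gradient descent: repeatedly step against the gradient."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        x = x - step_size * grad(x)  # step proportional to the negative gradient
    return x

# Example: minimize f(x, y) = (x - 3)^2 + (y + 1)^2, whose gradient is
# (2(x - 3), 2(y + 1)); the minimum is at (3, -1).
grad_f = lambda v: np.array([2 * (v[0] - 3), 2 * (v[1] + 1)])
print(gradient_descent(grad_f, x0=[0.0, 0.0]))  # approximately [3. -1.]
```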


Tag usage:

Questions with this tag should be about implementation and programming problems, not about the theoretical properties of the optimization algorithm. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.



1428 questions
10 votes, 2 answers

Tensorflow: How to write op with gradient in python?

I would like to write a TensorFlow op in python, but I would like it to be differentiable (to be able to compute a gradient). This question asks how to write an op in python, and the answer suggests using py_func (which has no gradient): Tensorflow:…
Alex I
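One way this is commonly handled in modern TensorFlow (not necessarily the answer given to this question) is tf.custom_gradient, which pairs a Python forward computation with an explicit gradient function; the clipping here is just an example of a hand-written gradient:

```python
import tensorflow as tf

@tf.custom_gradient
def clipped_square(x):
    y = tf.square(x)
    def grad(dy):
        # Hand-written gradient: d(x^2)/dx = 2x, clipped for illustration
        return dy * tf.clip_by_value(2.0 * x, -1.0, 1.0)
    return y, grad

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = clipped_square(x)
print(tape.gradient(y, x))  # 1.0, because 2*3 = 6 is clipped to 1
```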
10 votes, 1 answer

Stochastic gradient descent from gradient descent implementation in R

I have a working implementation of multivariable linear regression using gradient descent in R. I'd like to see if I can use what I have to run a stochastic gradient descent. I'm not sure if this is really inefficient or not. For example, for each…
Daina
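The core change from batch to stochastic gradient descent, sketched in Python/NumPy rather than R for consistency with the other examples on this page (data and learning rate invented): update the parameters from one shuffled example at a time instead of from the full-dataset gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.c_[np.ones(100), rng.normal(size=(100, 2))]  # design matrix with intercept
theta_true = np.array([1.0, 2.0, -0.5])
y = X @ theta_true + 0.01 * rng.normal(size=100)

theta = np.zeros(3)
lr = 0.05
for epoch in range(50):
    for i in rng.permutation(len(y)):   # visit examples in random order
        err = X[i] @ theta - y[i]       # residual for a single example
        theta -= lr * err * X[i]        # single-example gradient step
print(theta)                            # approximately [1.0, 2.0, -0.5]
```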
10 votes, 1 answer

Machine learning - Linear regression using batch gradient descent

I am trying to implement batch gradient descent on a data set with a single feature and multiple training examples (m). When I try using the normal equation, I get the right answer, but the wrong one with this code below, which performs batch gradient…
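A common way to debug this situation, sketched with invented data (the missing 1/m factor and unscaled features are frequent culprits): fit the same matrix with both the normal equation and batch gradient descent and compare.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 50
X = np.c_[np.ones(m), rng.uniform(0, 10, m)]   # one feature plus intercept column
y = X @ np.array([4.0, 0.7]) + rng.normal(size=m)

# Normal equation: theta = (X^T X)^{-1} X^T y
theta_ne = np.linalg.solve(X.T @ X, X.T @ y)

# Batch gradient descent on the same cost J = (1/2m) ||X theta - y||^2
theta = np.zeros(2)
alpha = 0.01
for _ in range(20000):
    theta -= alpha * (X.T @ (X @ theta - y)) / m   # full-batch gradient; note the 1/m
print(theta_ne, theta)                             # the two should agree closely
```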
9 votes, 3 answers

Is my implementation of stochastic gradient descent correct?

I am trying to develop stochastic gradient descent, but I don't know if it is 100% correct. The cost generated by my stochastic gradient descent algorithm is sometimes very far from the one generated by fminunc or batch gradient descent. while batch…
9 votes, 4 answers

Fast gradient-descent implementation in a C++ library?

I'm looking to run a gradient descent optimization to minimize the cost of an instantiation of variables. My program is very computationally expensive, so I'm looking for a popular library with a fast implementation of GD. What is the recommended…
Jim
8 votes, 2 answers

Accumulating Gradients

I want to accumulate the gradients before I do a backward pass. So I am wondering what the right way of doing it is. According to this article it's: model.zero_grad() # Reset gradients tensors for i, (inputs, labels) in…
sachinruk
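The pattern under discussion, as a self-contained PyTorch sketch (the model, data, and accumulation count are placeholders): gradients from .backward() add into .grad until they are explicitly zeroed, so stepping every N batches simulates an N-times-larger batch.

```python
import torch

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()
accum_steps = 4                                          # assumed accumulation count

optimizer.zero_grad()
for i in range(16):                                      # stand-in for a real data loader
    inputs, labels = torch.randn(8, 10), torch.randn(8, 1)
    loss = loss_fn(model(inputs), labels) / accum_steps  # scale so gradients average
    loss.backward()                                      # accumulates into .grad
    if (i + 1) % accum_steps == 0:
        optimizer.step()                                 # one update per accum_steps batches
        optimizer.zero_grad()
```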
8 votes, 2 answers

Gradient descent implementation python - contour lines

As a self study exercise I am trying to implement gradient descent on a linear regression problem from scratch and plot the resulting iterations on a contour plot. My gradient descent implementation gives the correct result (tested with Sklearn)…
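A sketch of the usual plotting approach (cost surface and step size invented): record the parameter iterates during descent, then overlay the trajectory on a contour plot of the cost.

```python
import numpy as np
import matplotlib.pyplot as plt

# Toy cost surface: J(a, b) = a^2 + 3*b^2, with gradient (2a, 6b)
theta = np.array([4.0, 2.5])
path = [theta.copy()]
for _ in range(30):
    theta -= 0.1 * np.array([2 * theta[0], 6 * theta[1]])
    path.append(theta.copy())
path = np.array(path)

a, b = np.meshgrid(np.linspace(-5, 5, 100), np.linspace(-3, 3, 100))
plt.contour(a, b, a**2 + 3 * b**2, levels=20)
plt.plot(path[:, 0], path[:, 1], "o-")   # the descent trajectory
plt.show()
```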
8 votes, 1 answer

TensorFlow average gradients over several batches

This is a possible duplicate of Tensorflow: How to get gradients per instance in a batch?. I ask it anyway, because there has not been a satisfying answer and the goal here is a bit different. I have a very big network that I can fit on my GPU but…
niko
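With eager TensorFlow 2 (rather than the graph-mode setup this question dates from), one way to average gradients over several micro-batches before applying them is to accumulate GradientTape results; shapes and data here are placeholders:

```python
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.build(input_shape=(None, 10))
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
n_micro = 4

accum = [tf.zeros_like(v) for v in model.trainable_variables]
for _ in range(n_micro):
    x, y = tf.random.normal((8, 10)), tf.random.normal((8, 1))
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(model(x) - y))
    grads = tape.gradient(loss, model.trainable_variables)
    accum = [a + g for a, g in zip(accum, grads)]

avg = [a / n_micro for a in accum]    # average over micro-batches, then apply once
optimizer.apply_gradients(zip(avg, model.trainable_variables))
```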
8 votes, 1 answer

Gradient of a Loss Function for an SVM

I'm working on this class on convolutional neural networks. I've been trying to implement the gradient of a loss function for an svm and (I have a copy of the solution) I'm having trouble understanding why the solution is correct. On this page it…
David
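For reference, the multi-class hinge (SVM) loss used in that course and its analytic gradient, sketched vectorized in NumPy; the function name, margin of 1, and shapes follow the usual CS231n conventions rather than the question's exact code, and regularization is omitted:

```python
import numpy as np

def svm_loss_and_grad(W, X, y, delta=1.0):
    """W: (D, C) weights, X: (N, D) data, y: (N,) integer labels."""
    N = X.shape[0]
    scores = X @ W                                # (N, C)
    correct = scores[np.arange(N), y][:, None]    # score of the true class
    margins = np.maximum(0, scores - correct + delta)
    margins[np.arange(N), y] = 0                  # the true class contributes no loss
    loss = margins.sum() / N

    # Gradient: +x_i for each class that violates the margin, and
    # -(number of violations) * x_i in the true-class column.
    mask = (margins > 0).astype(float)            # (N, C)
    mask[np.arange(N), y] = -mask.sum(axis=1)
    grad = X.T @ mask / N                         # (D, C)
    return loss, grad
```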
7 votes, 1 answer

Loss with custom backward function in PyTorch - exploding loss in simple MSE example

Before working on something more complex, where I knew I would have to implement my own backward pass, I wanted to try something nice and simple. So, I tried to do linear regression with mean squared error loss using PyTorch. This went wrong (see…
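The standard vehicle for a hand-written backward pass in PyTorch is torch.autograd.Function; here is a minimal MSE example (the class name and shapes are illustrative, not the asker's code). A common cause of the exploding loss described here is omitting the 1/N scale factor in the backward pass.

```python
import torch

class MyMSE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, pred, target):
        ctx.save_for_backward(pred, target)
        return ((pred - target) ** 2).mean()

    @staticmethod
    def backward(ctx, grad_output):
        pred, target = ctx.saved_tensors
        # d/dpred of mean((pred - target)^2) = 2 * (pred - target) / N
        n = pred.numel()
        grad_pred = grad_output * 2.0 * (pred - target) / n
        return grad_pred, None                    # no gradient needed for target

pred = torch.randn(5, requires_grad=True)
target = torch.randn(5)
MyMSE.apply(pred, target).backward()
print(torch.allclose(pred.grad, 2 * (pred.detach() - target) / 5))  # True
```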
7 votes, 2 answers

Why torch.sum() before doing .backward()?

I can see what this code below from this video is trying to do. But the sum from y=torch.sum(x**2) confuses me. With the sum operation, y becomes a tensor with one single value. As I understand it, .backward() calculates derivatives, so why would we want…
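The short answer: .backward() needs a scalar (or an explicit grad_output) because autograd computes vector-Jacobian products; summing is a convenient way to get per-element derivatives when each output depends on only one input. A tiny sketch:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = torch.sum(x ** 2)          # scalar: y = x1^2 + x2^2 + x3^2
y.backward()                   # dy/dx_i = 2 * x_i
print(x.grad)                  # tensor([2., 4., 6.])

# Equivalent without sum: pass the implicit all-ones gradient explicitly
x.grad = None
z = x ** 2                     # non-scalar output
z.backward(torch.ones_like(z))
print(x.grad)                  # tensor([2., 4., 6.])
```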
7 votes, 1 answer

Correct backpropagation in simple perceptron

Given the simple OR gate problem: or_input = np.array([[0,0], [0,1], [1,0], [1,1]]) or_output = np.array([[0,1,1,1]]).T If we train a simple single-layered perceptron (without backpropagation), we could do something like this: import numpy as…
alvas
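A minimal single-layer sketch with a sigmoid unit and the chain-rule update, in the same spirit as the question (the cross-entropy delta, learning rate, and epoch count are illustrative choices, not the question's code):

```python
import numpy as np

or_input = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
or_output = np.array([[0, 1, 1, 1]]).T

sigmoid = lambda z: 1 / (1 + np.exp(-z))

rng = np.random.default_rng(42)
W = rng.normal(size=(2, 1))
b = 0.0
lr = 1.0
for _ in range(5000):
    pred = sigmoid(or_input @ W + b)              # forward pass
    delta = pred - or_output                      # cross-entropy gradient w.r.t. pre-activation
    W -= lr * or_input.T @ delta / len(or_input)  # chain rule back to the weights
    b -= lr * delta.mean()
print(np.round(sigmoid(or_input @ W + b)))        # approximately [[0], [1], [1], [1]]
```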
7 votes, 1 answer

Backpropagation with Momentum

I'm following this tutorial for implementing the Backpropagation algorithm. However, I am stuck at implementing momentum for this algorithm. Without Momentum, this is the code for weight update method: def update_weights(network, row, l_rate): …
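The usual change for momentum, sketched generically in Python (the class name and hyperparameters are assumptions; the tutorial's own variable names may differ): keep a per-weight velocity that mixes the previous velocity with the current gradient.

```python
import numpy as np

class MomentumSGD:
    """Heavy-ball momentum: v <- m*v - lr*grad; w <- w + v."""
    def __init__(self, l_rate=0.1, momentum=0.9):
        self.l_rate, self.momentum = l_rate, momentum
        self.velocity = {}

    def update(self, name, w, grad):
        v = self.momentum * self.velocity.get(name, np.zeros_like(w))
        v -= self.l_rate * grad          # blend previous velocity with new gradient
        self.velocity[name] = v
        return w + v

# Usage on a toy 1-D quadratic f(w) = w^2, whose gradient is 2w:
opt = MomentumSGD()
w = np.array(5.0)
for _ in range(200):
    w = opt.update("w", w, 2 * w)
print(w)                                 # close to 0
```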
7 votes, 1 answer

Where is the code for gradient descent?

Running some experiments with TensorFlow, I want to look at the implementation of some functions just to see exactly how some things are done. I started with the simple case of tf.train.GradientDescentOptimizer and downloaded the zip of the full source…
rwallace
7 votes, 3 answers

Is Stochastic gradient descent a classifier or an optimizer?

I am new to Machine Learning and I am trying to analyze the classification algorithm for a project of mine. I came across SGDClassifier in the sklearn library. But a lot of papers have referred to SGD as an optimization technique. Can someone please…
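Both readings are right: SGD is the optimizer, and sklearn's SGDClassifier is a family of linear classifiers fitted with it; the loss parameter picks the model. A short sketch (parameter names follow recent scikit-learn versions):

```python
from sklearn.linear_model import SGDClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=500, random_state=0)

# loss="hinge" gives a linear SVM; loss="log_loss" gives logistic regression.
# Either way, the *optimizer* doing the fitting is stochastic gradient descent.
clf = SGDClassifier(loss="hinge", max_iter=1000, tol=1e-3, random_state=0)
clf.fit(X, y)
print(clf.score(X, y))
```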