Questions tagged [gradient-descent]

Gradient Descent is an algorithm for finding the minimum of a function. It iteratively computes the gradient (the vector of partial derivatives) of the function and steps in the direction of the negative gradient, with a step size proportional to the gradient's magnitude. One major application of Gradient Descent is fitting a parameterized model to a set of data: the function to be minimized is an error function for the model.

Wiki:

Gradient descent is a first-order iterative optimization algorithm, used to find values of the parameters (coefficients) of a function f that minimize a cost function.

To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point.

Gradient descent is best used when the parameters cannot be calculated analytically (e.g. using linear algebra) and must be searched for by an optimization algorithm.

Gradient descent is also known as steepest descent, or the method of steepest descent.
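
As a concrete illustration, here is a minimal sketch of the update rule theta := theta - alpha * grad(theta) in Python; the objective and step size are placeholders chosen for the example:

    # Minimize f(x) = (x - 3)^2; its derivative is 2 * (x - 3).
    def grad(x):
        return 2.0 * (x - 3.0)

    x = 0.0        # starting point
    alpha = 0.1    # learning rate (step size)
    for _ in range(100):
        x -= alpha * grad(x)   # step proportional to the negative gradient

    print(x)  # converges toward 3.0, the minimizer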


Tag usage:

Questions with this tag should be about implementation and programming problems, not about the theoretical properties of the optimization algorithm. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.



1,428 questions
4 votes, 2 answers

How to directly set the gradient of a layer before backpropagation?

Imagine a tiny network defined as follows, where linear is a typical helper function defining TensorFlow variables for a weight matrix and activation function: final_layer = linear(linear(_input,10,tf.nn.tanh),20) Normally this would be optimized…
asked by zergylord (4,368 rep)
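
One common approach under the TensorFlow 1.x graph API is to build the gradient yourself with tf.gradients (its grad_ys argument injects a custom upstream gradient) and hand the result to apply_gradients. A sketch under that assumption; the shapes and names below are illustrative stand-ins, not the question's actual linear helper:

    import tensorflow as tf  # TF 1.x-style graph API

    x = tf.placeholder(tf.float32, [None, 10])
    w = tf.Variable(tf.random_normal([10, 20]))
    layer = tf.matmul(x, w)                      # stand-in for the final layer

    # Inject a hand-set gradient for `layer` instead of deriving one from a loss.
    layer_grad = tf.placeholder(tf.float32, [None, 20])
    grads = tf.gradients(layer, [w], grad_ys=layer_grad)
    train_op = tf.train.GradientDescentOptimizer(0.1).apply_gradients(zip(grads, [w]))
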
4 votes, 2 answers

Backpropagation in Gradient Descent for Neural Networks vs. Linear Regression

I'm trying to understand "Back Propagation" as it is used in Neural Nets that are optimized using Gradient Descent. Reading through the literature, it seems to do a few things: use random weights to start with and get error values; perform Gradient…
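
The loop the question describes (random initial weights, forward pass for error values, gradient step) looks roughly like this in NumPy for one hidden layer; the data and shapes here are made up for illustration:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.random((5, 3)); y = rng.random((5, 1))
    W1 = rng.standard_normal((3, 4))       # random weights to start with
    W2 = rng.standard_normal((4, 1))
    lr = 0.1

    for _ in range(1000):
        h = np.tanh(X @ W1)                # forward pass
        pred = h @ W2
        err = pred - y                     # error values
        # back propagation: the chain rule yields each layer's gradient
        gW2 = h.T @ err / len(y)
        gW1 = X.T @ ((err @ W2.T) * (1.0 - h ** 2)) / len(y)
        W2 -= lr * gW2                     # gradient descent step
        W1 -= lr * gW1

Plain linear regression is the special case with no hidden layer, where the same gradient step applies directly to the output weights.
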
4 votes, 1 answer

Gradient descent vectorised computation dimensions not correct

I have 1 input layer, 2 hidden layers and 1 output layer, and for a single training example x with output y I have computed the following: x = [1;0;1]; y = [1;1;1]; theta1 = 4.7300 3.2800 1.4600 0 0 0 …
asked by blue-sky (51,962 rep)
4 votes, 1 answer

AdamOptimizer and GradientDescentOptimizer from tensorflow not able to fit simple data

Similar question: here. I am trying out TensorFlow. I generated simple data which is linearly separable and tried to fit a linear equation to it. Here is the code. np.random.seed(2010) n = 300 x_data = np.random.random([n, 2]).tolist() y_data = [[1.,…
asked by anuml (43 rep)
4 votes, 1 answer

How to write the updateGradInput and accGradParameters in torch?

I know the two functions are for torch's backward propagation and the interface is as follows updateGradInput(input, gradOutput) accGradParameters(input, gradOutput, scale) I'm confused about what the gradInput and gradOutput really mean in a…
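
Loosely, gradOutput is dL/d(output), handed back by the layer above, and gradInput is dL/d(input), which the module must produce for the layer below; accGradParameters accumulates the parameter gradients, scaled by scale. A NumPy analogue for a linear module y = W x (a sketch of the semantics, not the Torch API itself):

    import numpy as np

    # updateGradInput: gradInput = dL/dx = W^T @ gradOutput
    def update_grad_input(W, grad_output):
        return W.T @ grad_output

    # accGradParameters: accumulate dL/dW = gradOutput x^T, scaled by `scale`
    def acc_grad_parameters(x, grad_output, grad_W, scale=1.0):
        grad_W += scale * np.outer(grad_output, x)
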
4 votes, 1 answer

Inception-v3 using RMSProp epsilon=1

I just read the Inception-v3 paper and the training code released by its authors, and I found that when doing RMSProp optimization the authors used epsilon=1. However, to my knowledge, people usually use 1e-10 or some other small value, and TensorFlow set…
asked by ffmpbgrnn (41 rep)
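
For context, epsilon appears in the RMSProp denominator; with epsilon=1 the effective step size stays damped until the squared-gradient average grows. A sketch (hyperparameter values are illustrative, and implementations differ on whether epsilon sits inside or outside the square root):

    import numpy as np

    def rmsprop_step(theta, grad, cache, lr=0.001, decay=0.9, eps=1.0):
        cache = decay * cache + (1.0 - decay) * grad ** 2   # moving average of squared gradients
        theta = theta - lr * grad / (np.sqrt(cache) + eps)  # eps keeps the division stable
        return theta, cache
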
4 votes, 1 answer

Wrong weights using batch gradient descent

I am working on linear regression with two-dimensional data but I cannot get the correct weights for the regression line. There seems to be a problem with the following code because the calculated weights for the regression line are not…
asked by evolved (1,850 rep)
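
For comparison, here is a minimal correct batch update for two-dimensional linear regression in NumPy; two common sources of wrong weights are forgetting to average the gradient over the batch and not updating all weights simultaneously. The data below is synthetic:

    import numpy as np

    X = np.c_[np.ones(5), np.arange(5.0)]       # bias column plus one feature
    y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])     # exactly y = 1 + 2x
    w = np.zeros(2)
    lr = 0.05

    for _ in range(5000):
        grad = X.T @ (X @ w - y) / len(y)       # averaged over the whole batch
        w -= lr * grad                          # simultaneous update of all weights
    print(w)                                    # approaches [1., 2.]
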
4 votes, 3 answers

How to check if gradient descent with multiple variables converged correctly?

In linear regression with 1 variable I can clearly see the prediction line on a plot and check whether it properly fits the training data. I just create a plot of the variable against the output and construct the prediction line based on the found values of Theta 0 and…
asked by Erba Aitbayev (4,167 rep)
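
A convergence check that works for any number of variables is to record the cost at every iteration and plot it against the iteration number: it should decrease and flatten out. A sketch on a toy objective (matplotlib assumed available):

    import matplotlib.pyplot as plt

    # Toy run: minimize (x - 3)^2 and record the cost each iteration.
    x, J_history = 5.0, []
    for _ in range(50):
        J_history.append((x - 3.0) ** 2)
        x -= 0.1 * 2.0 * (x - 3.0)

    plt.plot(J_history)               # should decrease monotonically and level off
    plt.xlabel("iteration")
    plt.ylabel("cost J(theta)")
    plt.show()
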
4 votes, 1 answer

Subscript indices must be real positive integers or logicals

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters) %GRADIENTDESCENT Performs gradient descent to learn theta % theta = GRADIENTDESENT(X, y, theta, alpha, num_iters) updates theta by % taking num_iters gradient steps…
asked by ks4929 (41 rep)
4 votes, 2 answers

Logistic regression - Calculating cost function returns wrong results

I just started taking Andrew Ng's course on Machine Learning on Coursera. The topic of the third week is logistic regression, so I am trying to implement the following cost function. The hypothesis is defined as: where g is the sigmoid function:…
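
The formulas in the question were images that did not survive extraction; in the course they are the standard ones, h(x) = g(theta^T x) with sigmoid g(z) = 1 / (1 + e^(-z)) and cost J(theta) = (1/m) * sum(-y*log(h) - (1-y)*log(1-h)). A vectorized NumPy rendering for reference:

    import numpy as np

    def cost(theta, X, y):
        h = 1.0 / (1.0 + np.exp(-X @ theta))   # sigmoid hypothesis g(X @ theta)
        return np.mean(-y * np.log(h) - (1.0 - y) * np.log(1.0 - h))
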
4 votes, 1 answer

Dot Product between 2 matrices in ruby, most efficient way

I am writing a machine learning algorithm in Ruby that uses gradient descent and logistic regression. The algorithm works fine, except that in Ruby the dot product between matrices is very slow. I started using a gem, RubyPython, that allows you to…
asked by Leon (1,262 rep)
4 votes, 1 answer

gradient descent in neural network training plateauing

I've been trying to implement a basic back-propagation neural network in Python, and have finished the programming for initializing and training the weight set. However, on all the sets I train, the error (mean-squared) always converges to a weird…
4 votes, 0 answers

Is there an optimizer in Apache Spark / MLLib that accepts a custom Loss Function that does not require a gradient?

I am just beginning to experiment with Apache Spark / MLlib and I would like to try fitting a model that has a difficult-to-differentiate likelihood function. I know that in R the optimization algorithms do not require specifying a gradient (I…
4 votes, 2 answers

minFunc package usage

I have been using MATLAB's fminunc function to solve my optimization problem, and I want to try the minFunc package: http://www.di.ens.fr/~mschmidt/Software/minFunc.html When using fminunc, I defined a function funObj.m which gives me the objective…
asked by Swami (71 rep)
4 votes, 1 answer

Newton's Gradient Descent Linear Regression

I am trying to implement a function in MATLAB that calculates the optimal linear regression using Newton's method. However, I got stuck at one point: I don't know how to find the second derivative, so I cannot implement it. Here is my…
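
For the squared-error cost J(theta) = (1/2m) * ||X @ theta - y||^2, the second derivative is the Hessian H = (1/m) * X^T X, which is constant, so a single Newton step theta := theta - H^(-1) * grad lands on the least-squares solution. A sketch (assuming X^T X is invertible):

    import numpy as np

    def newton_step(theta, X, y):
        m = len(y)
        grad = X.T @ (X @ theta - y) / m     # first derivative of J
        H = X.T @ X / m                      # second derivative (Hessian), constant here
        return theta - np.linalg.solve(H, grad)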