Questions tagged [gradient-descent]

Gradient Descent is an algorithm for finding the minimum of a function. It iteratively computes the gradient (the vector of partial derivatives) of the function and takes steps proportional to the negative of that gradient. One major application of Gradient Descent is fitting a parameterized model to a set of data: the function to be minimized is an error function for the model.
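The description above is prose only, so here is a minimal sketch of the idea in Python (illustrative only, not part of the tag wiki; the function, starting point, learning rate, and iteration count are invented for the example):

    import numpy as np

    def f(p):
        # Function to minimize: f(x, y) = x^2 + y^2, with its minimum at (0, 0).
        return p[0] ** 2 + p[1] ** 2

    def grad_f(p):
        # Vector of partial derivatives (the gradient) of f.
        return np.array([2.0 * p[0], 2.0 * p[1]])

    p = np.array([3.0, -4.0])   # arbitrary starting point
    learning_rate = 0.1         # step size, chosen only for illustration

    for _ in range(100):
        # Take a step proportional to the negative of the gradient.
        p = p - learning_rate * grad_f(p)

    print(p, f(p))  # p ends up very close to the minimum at (0, 0)

Each iteration moves the point downhill; this sketch simply stops after a fixed number of steps, though a tolerance on the gradient norm is another common stopping criterion.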

Wiki:

Gradient descent is a first-order iterative optimization algorithm. It is used to find the values of the parameters (coefficients) of a function (f) that minimize a cost function (cost).

To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point.

Gradient descent is best used when the parameters cannot be calculated analytically (e.g. using linear algebra) and must be searched for by an optimization algorithm.

Gradient descent is also known as steepest descent, or the method of steepest descent.
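
As a sketch of the parameter-fitting use case described above (again illustrative only: the synthetic data, learning rate, and iteration count are invented, and linear regression is used purely because its cost and gradient are short to write, even though this particular model also has a closed-form solution):

    import numpy as np

    rng = np.random.default_rng(0)
    X = np.column_stack([np.ones(100), rng.uniform(-5, 5, size=100)])  # intercept column + one feature
    true_theta = np.array([2.0, 0.5])
    y = X @ true_theta + rng.normal(scale=0.1, size=100)               # noisy observations

    theta = np.zeros(2)   # coefficients to be learned
    alpha = 0.01          # learning rate
    m = len(y)

    for _ in range(5000):
        error = X @ theta - y
        gradient = (X.T @ error) / m      # gradient of the mean-squared-error cost
        theta = theta - alpha * gradient  # step against the gradient

    print(theta)  # should land close to [2.0, 0.5]

If the learning rate is set too high the updates overshoot and the cost diverges instead of decreasing, which is a recurring theme in the questions listed below.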


Tag usage:

Questions with this tag should be about implementation and programming problems, not about the theoretical properties of the optimization algorithm. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.


1428 questions
0 votes · 1 answer

None Value while optimizing data with gradient descent

I'm trying to make a small neural network in TensorFlow and I'm a bit new to this. I saw this in a tutorial (http://de.slideshare.net/tw_dsconf/tensorflow-tutorial) and everything works fine until I try to optimize the weights (with gradient…
0 votes · 1 answer

SGD convergence test using learning rates

Can anyone give an explanation for the convergence test presented in the 8th minute of this lecture by Hugo Larochelle?
0 votes · 1 answer

Logistic Regression with Gradient Descent on large data

I have a training set with about 300,000 examples and about 50-60 features; it is also a multiclass problem with about 7 classes. I have a logistic regression function that finds the convergence of the parameters using gradient descent. My gradient…
Sera_Vinicit · 149
0 votes · 1 answer

Linear regression by gradient descent in R

I am very new to machine learning and am currently trying to do a linear regression using R; my code is below: x <- runif(1000, -5, 5) y <- runif(1000, -2, 2) z <- x + y res <- lm(z ~ x + y) alpha <- 0.01 num_iters <- 10000 theta <- matrix(c(0,0,0),…
0 votes · 1 answer

Tensorflow gradients are always zero

TensorFlow gradients are always zero with respect to conv layers that come after the first conv layer. I've tried different ways to check this, but the gradients are always zero! Here is the small reproducible code that can be run to check that. from…
0 votes · 1 answer

Implementing bias neurons in a neural network

I implemented bias units for my neural network with gradient descent, but I'm not 100% sure if I've implemented it the right way. I would be glad if you could quickly look through my code. Only the parts with if bias: are important. And my second…
0 votes · 1 answer

Non-Symbolic loss in Keras/TensorFlow

For a university project, I want to train a (simulated) robot to hit a ball given the position and velocity. The first thing to try is policy gradients: I have a parametric trajectory generator. For every training position, I feed the position…
jcklie · 4,054
0 votes · 0 answers

Why is the gradient of tf.sign() not equal to 0?

I expected the gradient for tf.sign() in TensorFlow to be equal to 0 or None. However, when I examined the gradients, I found that they were equal to very small numbers (e.g. 1.86264515e-09). Why is that? (If you are curious as to why I even want to…
random_stuff · 197
0 votes · 1 answer

Stochastic Gradient Descent design matrix too big for R

I'm trying to implement a baseline prediction model of movie ratings (akin to the various baseline models from the Netflix prize), with parameters learned via stochastic gradient descent. However, because both explanatory variables are categorical…
Caio Kenup · 43
0 votes · 1 answer

Octave: steepest descent: how to minimize an equation

I am new to Octave. Now I am trying to implement the steepest descent algorithm in Octave, for example minimization of f(x1,x2) = x1^3 + x2^3 - 2*x1*x2. Estimate starting design point x0, iteration counter k0, convergence parameter tolerance = 0.1.…
voxter · 853
0 votes · 1 answer

Better alternative to gradient descent

Is there any method that is faster and more efficient than gradient descent for updating the weights in a neural network? Can we use multiplicative weight updates in place of gradient descent? Is it better?
user6460588 · 144
0 votes · 1 answer

MNIST Tensorflow vs code from Michael Nielsen

I read Michael Nielsen's book neuralnetworksanddeeplearning.com about neural networks. He always does the examples with the MNIST data. I took his code and designed exactly the same network in TensorFlow, but I realized that the results in…
jojo123456 · 341
0 votes · 1 answer

Gradient descent values not correct

I'm attempting to implement gradient descent using code from: Gradient Descent implementation in octave. I've amended the code to the following: X = [1; 1; 1;] y = [1; 0; 1;] m = length(y); X = [ones(m, 1), data(:,1)]; theta = zeros(2, 1); …
thepen · 371
0 votes · 0 answers

Gradient descent not working without normalization, why?

My question is based on the data from the Coursera course https://www.coursera.org/learn/machine-learning/, but after a search it appears to be a common problem. Gradient descent works perfectly on normalized data (pic. 1), but goes in the wrong…
0 votes · 1 answer

Implementing gradient descent with Scala and Breeze - error: could not find implicit value for parameter op:

I'm attempting to apply a gradient descent implementation in Scala and Breeze, based on the Octave code from: Gradient Descent implementation in octave. The Octave code I'm attempting to rewrite is: theta = theta - ((1/m) * ((X * theta) - y)' * X)' *…
thepen · 371