Questions tagged [gradient-descent]

Gradient Descent is an algorithm for finding a (local) minimum of a function. It iteratively calculates the partial derivatives (gradient) of the function and descends in steps proportional to the negative of that gradient. One major application of Gradient Descent is fitting a parameterized model to a set of data: the function to be minimized is an error function for the model.

Wiki:

Gradient descent is a first-order iterative optimization algorithm. It is used to find the values of the parameters (coefficients) of a function (f) that minimize a cost function (cost).

To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point.

Gradient descent is best used when the parameters cannot be calculated analytically (e.g. using linear algebra) and must be searched for by an optimization algorithm.

Gradient descent is also known as steepest descent, or the method of steepest descent.
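
In code, each iteration evaluates the gradient of the cost at the current parameters and takes a small step in the opposite direction. A minimal illustrative sketch in Python (fitting y ≈ w*x + b by least squares; the learning rate and iteration count here are arbitrary choices, not part of the tag wiki):

    import numpy as np

    def gradient_descent(x, y, lr=0.05, iterations=2000):
        # Fit y ≈ w*x + b by minimizing the mean squared error.
        w, b = 0.0, 0.0
        n = len(x)
        for _ in range(iterations):
            error = w * x + b - y                    # residuals of the current model
            grad_w = (2.0 / n) * np.dot(error, x)    # dMSE/dw
            grad_b = (2.0 / n) * error.sum()         # dMSE/db
            w -= lr * grad_w                         # step against the gradient
            b -= lr * grad_b
        return w, b

    # Usage: recover slope ≈ 3 and intercept ≈ 1 from noisy data.
    x = np.linspace(0.0, 1.0, 100)
    y = 3.0 * x + 1.0 + 0.05 * np.random.randn(100)
    w, b = gradient_descent(x, y)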


Tag usage:

Questions with this tag should be about implementation and programming problems, not about the theoretical properties of the optimization algorithm. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.



1428 questions
0 votes, 1 answer

FailedPreconditionError while trying to use RMSPropOptimizer on tensorflow

I am trying to use the RMSPropOptimizer for minimizing loss. Here's the part of the code that is relevant: import tensorflow as tf #build large convnet... #... opt = tf.train.RMSPropOptimizer(learning_rate=0.0025, decay=0.95) #do stuff to get…
aphdstudent
0 votes, 2 answers

Gradient Descent algorithm taking long time to complete - Efficiency - Python

I am trying to implement the gradient descent algorithm in Python; the following is my code: def grad_des(xvalues, yvalues, R=0.01, epsilon = 0.0001, MaxIterations=1000): xvalues= np.array(xvalues) yvalues = np.array(yvalues) length =…
haimen
0 votes, 1 answer

Trying to understand code that computes the gradient w.r.t. the input for LogSoftMax in Torch

Code comes from: https://github.com/torch/nn/blob/master/lib/THNN/generic/LogSoftMax.c I don't see how this code is computing the gradient w.r.t. the input for the module LogSoftMax. What I'm confused about is what the two for loops are doing. for…
lars
0 votes, 1 answer

Machine Learning - SVM - How to calculate bias when calculating vector W?

I am writing code for the SVM primal problem that uses SGD (stochastic subgradient descent) to optimize the vector W. The classification method is sign(w*x + bias). My question is how to find the best bias for it? I guess that it has to be done during the W…
zardav
0 votes, 1 answer

Can FTRL be applied to linear least squares, or is it just for logistic regression models?

I'm exploring follow-the-regularized-leader FTRL proximal gradient descent: paper, reference implementation. Everywhere FTRL is mentioned, the loss surface for the gradient descent is the LogLoss, and the model for prediction is Logistic regression.…
ihadanny
0 votes, 1 answer

Compute updates in Theano after N number of loss calculations

I've constructed a LSTM recurrent NNet using lasagne that is loosely based on the architecture in this blog post. My input is a text file that has around 1,000,000 sentences and a vocabulary of 2,000 word tokens. Normally, when I construct…
o-90
0 votes, 0 answers

Is it possible to implement gradient checking in a vectorized way when implementing neural network?

For example, can I add delta to all dimensions of w from y = w.dot(x) + b and calculate dw in one go?
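
A minimal sketch of how such a gradient check is commonly written (illustrative only, not code from the question): the analytic gradient dw is computed for all dimensions in one vectorized expression, while the finite-difference side still perturbs one coordinate at a time, and the two are compared elementwise.

    import numpy as np

    def numerical_grad(f, w, delta=1e-5):
        # Central-difference estimate of the gradient of a scalar function f at w,
        # perturbing one coordinate at a time.
        grad = np.zeros_like(w)
        for i in range(w.size):
            e = np.zeros_like(w)
            e.flat[i] = delta
            grad.flat[i] = (f(w + e) - f(w - e)) / (2.0 * delta)
        return grad

    # Toy check for loss = 0.5 * ||w.dot(x) + b - t||^2 on one sample.
    x = np.random.randn(4)
    t = np.random.randn(3)
    b = np.random.randn(3)
    w = np.random.randn(3, 4)

    loss = lambda w_: 0.5 * np.sum((w_.dot(x) + b - t) ** 2)
    analytic_dw = np.outer(w.dot(x) + b - t, x)   # all dimensions of dw at once
    numeric_dw = numerical_grad(loss, w)
    rel_err = np.abs(analytic_dw - numeric_dw).max() / (np.abs(analytic_dw).max() + 1e-12)
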
0 votes, 1 answer

theano GRU rnn adam optimizer

Technical information: OS: Mac OS X 10.9.5 IDE: Eclipse Mars.1 Release (4.5.1), with PyDev and Anaconda interpreter (grammar version 3.4) GPU: NVIDIA GeForce GT 650M Libs: numpy, aeosa, Sphinx-1.3.1, Theano 0.7, nltk-3.1 My background: I am very new…
0 votes, 1 answer

Represent Linear Regression features in Gradient Descent numerically

The following piece of Python code works well for performing gradient descent: def gradientDescent(x, y, theta, alpha, m, numIterations): xTrans = x.transpose() for i in range(0, numIterations): hypothesis = np.dot(x, theta) …
Saurabh Verma
0 votes, 1 answer

Treating missing values as really missing in Vowpal Wabbit

Is there a way to correctly represent missing values in VW input format -- not to impute with the mean or median, not to set them to 0 or any other constant, but to treat them as really missing, so that SGD and FTRL-Proximal algorithms could exclude…
kurtosis
0 votes, 1 answer

Gradient Descent with multiple variables without matrices

I'm new to Matlab and machine learning, and I tried to write a gradient descent function without using matrices. m is the number of examples in my training set, n is the number of features for each example. The function gradientDescentMulti takes 5…
Arthur
0 votes, 0 answers

Gradient Descent With Smoothness constraints

I have a noisy image Y and a known kernel H. I need to estimate a denoised image X such that the gradient of X is also minimised. J = ||Y - HX||^2 + Alpha * SmoothnessConstraint(X); SmoothnessConstraint(X) = L1norm(||Grad(X)||). How do I estimate the…
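
A rough sketch of one way to run gradient descent on such an objective (illustrative only, not the asker's setup): it assumes H acts as a 2-D 'same' convolution and replaces the L1 term with the smooth surrogate sqrt(g^2 + eps) so an ordinary gradient step applies; the step size, iteration count, and boundary handling are arbitrary choices.

    import numpy as np
    from scipy.signal import convolve2d

    def estimate_x(Y, H, alpha=0.1, lr=1e-3, iters=500, eps=1e-8):
        # Minimize ||Y - H*X||^2 + alpha * sum(sqrt(grad(X)^2 + eps)) by gradient descent.
        X = Y.astype(float)
        H_adj = H[::-1, ::-1]                                      # adjoint of the convolution: flipped kernel
        for _ in range(iters):
            resid = convolve2d(X, H, mode='same') - Y
            g_data = 2.0 * convolve2d(resid, H_adj, mode='same')   # gradient of the data term
            dx = np.diff(X, axis=1, append=X[:, -1:])              # forward differences of X
            dy = np.diff(X, axis=0, append=X[-1:, :])
            px = dx / np.sqrt(dx ** 2 + eps)                       # derivative of the smoothed |g|
            py = dy / np.sqrt(dy ** 2 + eps)
            g_smooth = -(np.diff(px, axis=1, prepend=px[:, :1])    # (negative) divergence
                         + np.diff(py, axis=0, prepend=py[:1, :]))
            X -= lr * (g_data + alpha * g_smooth)
        return X
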
0 votes, 0 answers

Adjusting proto file for Caffe

I'm trying to modify caffe.proto in order to add 2 new fields to SolverParameter. The two lines I add, at the very end of the SolverParameter message, are: optional int32 start_lr_policy = 36; // Iteration to start CLR policy described in…
user1245262
0 votes, 1 answer

SGD with L2 regularization in mllib

I am having difficulty reading open source mllib code for SGD with L2 regularization. The code is class SquaredL2Updater extends Updater { override def compute( weightsOld: Vector, gradient: Vector, stepSize: Double, iter: Int, regParam:…
bhomass
0 votes, 1 answer

How does mllib calculate the gradient?

Need an mllib expert to help explain the linear regression code. In LeastSquaresGradient.compute override def compute( data: Vector, label: Double, weights: Vector, cumGradient: Vector): Double = { val diff = dot(data, weights) -…