Questions tagged [gradient-descent]

Gradient Descent is an algorithm for finding a local minimum of a function. It iteratively evaluates the function's partial derivatives (its gradient) and takes steps proportional to the negative of that gradient. One major application of Gradient Descent is fitting a parameterized model to a set of data: the function to be minimized is an error function for the model.

Wiki:

Gradient descent is a first-order iterative optimization algorithm used to find the values of the parameters (coefficients) of a function (f) that minimize a cost function (cost).

To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point.

Gradient descent is best used when the parameters cannot be calculated analytically (e.g. using linear algebra) and must be searched for by an optimization algorithm.

Gradient descent is also known as steepest descent, or the method of steepest descent.
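
As a concrete illustration of the rule above, a minimal sketch (the quadratic objective, starting point, step size, and iteration count are arbitrary choices for the example, not part of the algorithm's definition):

    # Gradient descent on f(x, y) = x^2 + y^2, whose gradient is (2x, 2y).
    def gradient_descent(grad, start, learning_rate=0.1, steps=100):
        point = list(start)
        for _ in range(steps):
            g = grad(point)
            # Step proportional to the NEGATIVE of the gradient at the current point.
            point = [p - learning_rate * gi for p, gi in zip(point, g)]
        return point

    print(gradient_descent(lambda p: [2 * p[0], 2 * p[1]], start=[3.0, -4.0]))  # ~[0.0, 0.0]
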


Tag usage:

Questions with this tag should be about implementation and programming problems, not about the theoretical properties of the optimization algorithm. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.


1428 questions
4 votes, 1 answer

Is there a Python library where I can import a gradient descent function/method?

One way to do gradient descent in Python is to code it myself. However, given how popular a concept it is in machine learning, I was wondering if there is a Python library that I can import that gives me a gradient descent method (preferably…
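
A hedged pointer, not from the question itself: SciPy (if an external dependency is acceptable) ships gradient-based minimizers via scipy.optimize.minimize, where the gradient is supplied through the jac argument:

    import numpy as np
    from scipy.optimize import minimize

    def f(w):                      # toy objective with minimum at w = [3, 3]
        return np.sum((w - 3.0) ** 2)

    def grad_f(w):                 # its gradient, passed to the optimizer via jac=
        return 2.0 * (w - 3.0)

    result = minimize(f, x0=np.zeros(2), jac=grad_f, method="CG")
    print(result.x)                # close to [3., 3.]
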
4 votes, 1 answer

Gradient Descent for Linear Regression Exploding

I am trying to implement gradient descent for linear regression using this resource: https://spin.atomicobject.com/2014/06/24/gradient-descent-linear-regression/ My problem is that my weights are exploding (increasing exponentially) and essentially…
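
Exploding weights in a hand-rolled linear-regression loop are usually a step-size problem; a hedged sketch with made-up numbers (not the asker's data) showing how a too-large learning rate diverges while a smaller one settles down:

    import numpy as np

    X = np.c_[np.ones(4), np.array([100., 95., 92., 88.])]   # unscaled feature, made up
    y = np.array([97., 94., 90., 86.])

    def gd(X, y, lr, steps=20):
        w = np.zeros(X.shape[1])
        for _ in range(steps):
            grad = X.T @ (X @ w - y) / len(y)   # gradient of mean squared error
            w -= lr * grad
        return w

    print(gd(X, y, lr=1e-3))   # weights grow every step: the update overshoots
    print(gd(X, y, lr=1e-5))   # small enough step converges; scaling the feature helps more
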
4 votes, 3 answers

Pytorch: Gradient of output w.r.t parameters

I'm interested in finding the gradient of a neural network's output with respect to the parameters (weights and biases). More specifically, assume I have the following neural network structure [6,4,3,1]. The input sample size is 20. What I'm interested…
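
A hedged sketch of one way to get these gradients with torch.autograd.grad; the [6, 4, 3, 1] layer sizes and the batch of 20 mirror the question, everything else (activations, the sum reduction) is an assumption:

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(6, 4), nn.Tanh(),
                          nn.Linear(4, 3), nn.Tanh(),
                          nn.Linear(3, 1))
    x = torch.randn(20, 6)                 # 20 input samples
    out = model(x).sum()                   # reduce to a scalar for a single grad call
    params = list(model.parameters())
    grads = torch.autograd.grad(out, params)
    for p, g in zip(params, grads):
        print(p.shape, g.shape)            # each gradient matches its parameter's shape

For per-sample gradients, one would repeat the call per sample or use vmap-style utilities instead of summing over the batch.
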
4 votes, 1 answer

Implementing back propagation using numpy and python for cleveland dataset

I wanted to predict heart disease using the backpropagation algorithm for neural networks. For this I used the UCI heart disease data set linked here: processed cleveland. To do this, I used the code found on the following blog: Build a flexible Neural…
asked by Tarun Khare (1,447)
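
For reference, a hedged, generic sketch of backpropagation in NumPy (this is not the blog's code or the Cleveland data, just stand-in arrays of a similar shape): one hidden layer, sigmoid activations, squared error, plain gradient descent.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.random((20, 13))             # stand-in for the Cleveland features
    y = rng.integers(0, 2, (20, 1))      # stand-in for the 0/1 disease label

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    W1 = rng.normal(0, 0.1, (13, 8))     # input -> hidden
    W2 = rng.normal(0, 0.1, (8, 1))      # hidden -> output
    lr = 0.5
    for _ in range(1000):
        h = sigmoid(X @ W1)                      # forward pass
        out = sigmoid(h @ W2)
        d_out = (out - y) * out * (1 - out)      # backward pass: chain rule per layer
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * (h.T @ d_out) / len(X)        # gradient descent updates
        W1 -= lr * (X.T @ d_h) / len(X)
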
4 votes, 1 answer

Differentiate gradients

Is there a way to differentiate gradients in PyTorch? For example, I can do this in TensorFlow: from pylab import * import tensorflow as tf tf.reset_default_graph() sess = tf.InteractiveSession() def gradient_descent( loss_fnc, w, max_its, lr): …
asked by firdaus (541)
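
For the record, a hedged sketch of the PyTorch counterpart: torch.autograd.grad with create_graph=True keeps the gradient itself differentiable, so it can be differentiated again.

    import torch

    w = torch.tensor([3.0], requires_grad=True)
    loss = (w ** 2).sum()                                       # toy loss
    (g,) = torch.autograd.grad(loss, w, create_graph=True)      # d(loss)/dw = 2w, still differentiable
    (g2,) = torch.autograd.grad(g.sum(), w)                     # d2(loss)/dw2 = 2
    print(g.item(), g2.item())                                  # 6.0 2.0
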
4 votes, 0 answers

How to implement custom gradient of a tensor using eager execution in TensorFlow

In this (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/g3doc/guide.md) tutorial, a method of implementing a custom gradient function is provided. @tfe.custom_gradient def log1pexp(x): e = tf.exp(x) def…
asked by Sirui Lu (41)
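
A hedged note: in current TensorFlow the decorator is spelled tf.custom_gradient (the tfe.custom_gradient form in the linked contrib/eager guide is the older spelling); a minimal sketch of the same log1pexp example:

    import tensorflow as tf

    @tf.custom_gradient
    def log1pexp(x):
        e = tf.exp(x)
        def grad(dy):
            return dy * (1 - 1 / (1 + e))     # stable gradient of log(1 + e^x)
        return tf.math.log(1 + e), grad

    x = tf.constant(100.0)
    with tf.GradientTape() as tape:
        tape.watch(x)
        y = log1pexp(x)
    print(tape.gradient(y, x).numpy())        # 1.0; the naive gradient would be nan here
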
4 votes, 0 answers

how to use iter_size in caffe

I don't know the exact meaning of 'iter_size' in the Caffe solver, though I have googled a lot. It always says that 'iter_size' is a way to effectively increase the batch size without requiring extra GPU memory. Could I understand it as this: If set…
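
For intuition, a hedged analogy in PyTorch (not Caffe code): iter_size amounts to accumulating gradients over iter_size mini-batches before a single weight update, so the effective batch size is batch_size * iter_size without holding the larger batch in GPU memory.

    import torch

    model = torch.nn.Linear(10, 1)
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    iter_size = 4                                        # accumulate 4 mini-batches
    opt.zero_grad()
    for _ in range(iter_size):
        x, y = torch.randn(8, 10), torch.randn(8, 1)     # batch_size = 8
        loss = torch.nn.functional.mse_loss(model(x), y) / iter_size   # average across the group
        loss.backward()                                  # gradients accumulate in .grad
    opt.step()                                           # one update, effective batch of 32
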
4 votes, 2 answers

Simple gradient descent using mxnet

I'm trying to use MXNet's gradient descent optimizers to minimize a function. The equivalent example in Tensorflow would be: import tensorflow as tf x = tf.Variable(2, name='x', dtype=tf.float32) log_x = tf.log(x) log_x_squared =…
asked by user3363678 (178)
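
A hedged sketch of the MXNet side using the imperative autograd API, mirroring the TF snippet's (log x)^2 objective (a gluon.Trainer or mx.optimizer would wrap the same update step):

    import mxnet as mx
    from mxnet import autograd, nd

    x = nd.array([2.0])
    x.attach_grad()
    for _ in range(200):
        with autograd.record():
            loss = nd.log(x) ** 2
        loss.backward()
        x[:] = x - 0.1 * x.grad      # plain gradient descent step
    print(x.asnumpy())               # approaches 1.0, where (log x)^2 is minimal
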
4 votes, 1 answer

Eligibility traces in TensorFlow

According to Sutton's book, Reinforcement Learning: An Introduction, the update equation for the network weights is given by an expression in which e_t is the eligibility trace. This is similar to a gradient descent update with an extra e_t factor. Can this eligibility trace…
asked by nikpod (1,238)
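
The referenced equation does not survive in the excerpt; assuming it is the standard semi-gradient TD(lambda) update from Sutton and Barto, a hedged NumPy sketch of how the trace replaces the instantaneous gradient in the descent step:

    import numpy as np

    alpha, gamma, lam = 0.1, 0.99, 0.8
    w = np.zeros(4)                  # linear value function v(s) = w . phi(s)
    e = np.zeros_like(w)             # eligibility trace

    def td_lambda_step(phi_s, reward, phi_s_next, w, e):
        delta = reward + gamma * w @ phi_s_next - w @ phi_s   # TD error
        e[:] = gamma * lam * e + phi_s      # decayed trace plus grad of v(s) w.r.t. w
        w += alpha * delta * e              # gradient-descent-like update using the trace
        return w, e

    w, e = td_lambda_step(np.array([1., 0., 0., 0.]), 1.0, np.array([0., 1., 0., 0.]), w, e)
    print(w, e)
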
4 votes, 1 answer

how does xgboost enforce monotonicity constraints

I would like to know how xgboost enforces monotonic constraints while building the tree model. So far, by reading the code, I have understood that it has something to do with the weights of each node, but I am not able to understand why this approach…
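
A hedged usage sketch (roughly, the tree builder bounds child leaf weights so a split cannot reverse the requested direction): constraints are declared per feature with monotone_constraints, +1 for increasing, -1 for decreasing, 0 for unconstrained.

    import numpy as np
    import xgboost as xgb

    rng = np.random.default_rng(0)
    X = rng.random((200, 2))
    y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.1, 200)   # monotone toy target

    model = xgb.XGBRegressor(n_estimators=50, monotone_constraints="(1,-1)")
    model.fit(X, y)   # predictions non-decreasing in feature 0, non-increasing in feature 1
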
4 votes, 1 answer

Tensorflow Optimizers - multiple loss values passed to minimize()?

My first time using Tensorflow on the MNIST dataset, I had a really simple bug where I forgot to take the mean of my error values before passing them to the optimizer. In other words, instead of loss =…
asked by ejlu (41)
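
A hedged illustration of what happens in that case: when the tensor handed to the gradient computation is not a scalar, TensorFlow differentiates its implicit sum, so skipping reduce_mean scales the gradient by the batch size instead of averaging it.

    import tensorflow as tf

    w = tf.Variable(1.0)
    x = tf.constant([1.0, 2.0, 3.0, 4.0])
    with tf.GradientTape(persistent=True) as tape:
        per_example = (w * x) ** 2               # vector of per-example losses
        mean_loss = tf.reduce_mean(per_example)
    print(tape.gradient(per_example, w).numpy())   # 60.0 = gradient of the sum
    print(tape.gradient(mean_loss, w).numpy())     # 15.0 = averaged gradient
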
4 votes, 1 answer

How does changing batch size result in different prediction times?

I trained a data set (~8000 images) using Caffe and a batch size of 5 with the AlexNet network. This results in a prediction time of 800-900 ms. Then I changed the batch size to 56 (the maximum my machine can support) and the prediction time reduced to…
4 votes, 2 answers

Gradient Descent: thetas not converging

I'm trying to figure out gradient descent with Octave. With each iteration, my thetas get exponentially larger. I'm not sure what the issue is as I'm copying another function directly. Here are my matrices: X = 1 98 1 94 1 93 1 88 1…
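
A hedged sketch of the usual remedy (shown in Python rather than Octave, with the feature values visible in the excerpt and made-up targets): normalize the feature column before the loop; with raw values near 100 the same step size overshoots and theta grows each iteration.

    import numpy as np

    x = np.array([98., 94., 93., 88.])            # feature values visible in the excerpt
    y = np.array([97., 95., 91., 87.])            # made-up targets for illustration
    x_norm = (x - x.mean()) / x.std()             # feature scaling
    X = np.c_[np.ones_like(x_norm), x_norm]

    theta = np.zeros(2)
    alpha = 0.1
    for _ in range(500):
        theta -= alpha * X.T @ (X @ theta - y) / len(y)   # standard batch update
    print(theta)   # converges; with the raw column the same alpha makes theta blow up
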
4 votes, 3 answers

Why can't a deep NN approximate a simple ln(x) function?

I have created an ANN with two ReLU hidden layers + a linear activation layer and am trying to approximate the simple ln(x) function. I can't do this well. I am confused, because ln(x) in the range x:[0.0-1.0] should be approximated without problems (I am…
4 votes, 1 answer

Gradient Descent algorithm not converging in Haskell

I am trying to implement the gradient descent algorithm in Andrew Ng's ML course. After reading in the data, I try to implement the following, updating my list of theta values 1000 times, with the expectation of some convergence. The algorithm in…