Questions tagged [gradient-descent]

Gradient Descent is an algorithm for finding the minimum of a function. It iteratively computes the gradient (the vector of partial derivatives) of the function and descends in steps proportional to the negative of that gradient. One major application of Gradient Descent is fitting a parameterized model to a set of data: the function to be minimized is an error function for the model.

Wiki:

Gradient descent is a first-order iterative optimization algorithm used to find the values of the parameters (coefficients) of a function (f) that minimize a cost function (cost).

To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point.

Gradient descent is best used when the parameters cannot be calculated analytically (e.g. using linear algebra) and must be searched for by an optimization algorithm.

Gradient descent is also known as steepest descent, or the method of steepest descent.
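
As a minimal sketch of the update rule described above (the function, starting point, and step size are arbitrary choices for illustration), plain gradient descent in Python looks like this:

    # Minimize f(x) = (x - 3)^2 by repeatedly stepping against the derivative.
    def grad_f(x):
        return 2.0 * (x - 3.0)   # derivative of (x - 3)^2

    x = 0.0                      # arbitrary starting point
    learning_rate = 0.1          # the proportionality constant for each step

    for _ in range(100):
        x -= learning_rate * grad_f(x)

    print(x)                     # approaches the minimizer x = 3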


Tag usage:

Questions should be about implementation and programming problems, not about the theoretical properties of the optimization algorithm. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.



1428 questions
5
votes
2 answers

TypeError: minimize() missing 1 required positional argument: 'var_list'

I am trying to minimize the loss using SGD in TensorFlow 2.0, but it throws an error; the additional parameter that is causing the issue is var_list. import tensorflow as tf import numpy import matplotlib.pyplot…
Akshay • 81 • 2 • 2 • 9
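
For context, a minimal sketch of the API the error message refers to: in TensorFlow 2.x, tf.keras.optimizers.SGD().minimize expects the loss as a zero-argument callable plus an explicit var_list of variables to update (the data and variable names below are made up for illustration):

    import tensorflow as tf

    # Toy linear fit of y = 3x + 2 with SGD in TensorFlow 2.x eager mode.
    x = tf.constant([[1.0], [2.0], [3.0], [4.0]])
    y = tf.constant([[5.0], [8.0], [11.0], [14.0]])

    w = tf.Variable(0.0)
    b = tf.Variable(0.0)
    opt = tf.keras.optimizers.SGD(learning_rate=0.01)

    def loss_fn():
        return tf.reduce_mean(tf.square(w * x + b - y))

    for _ in range(1000):
        # var_list is the required positional argument from the error message.
        opt.minimize(loss_fn, var_list=[w, b])

    print(w.numpy(), b.numpy())   # drift toward w ≈ 3, b ≈ 2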
5
votes
1 answer

Why doesn't my custom made linear regression model match sklearn?

I'm attempting to create a simple linear model with Python using no libraries (other than numpy). Here's what I have import numpy as np import pandas np.random.seed(1) alpha = 0.1 def h(x, w): return np.dot(w.T, x) def cost(X, W, Y): …
Shamoon • 41,293 • 91 • 306 • 570
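
A minimal numpy-only sketch of the kind of model the question describes, with a bias column and the mean-squared-error gradient, compared against sklearn (data and hyperparameters are invented for illustration):

    import numpy as np
    from sklearn.linear_model import LinearRegression

    np.random.seed(1)
    X = np.random.rand(100, 1)
    y = 4.0 + 3.0 * X[:, 0] + 0.1 * np.random.randn(100)

    # Add a column of ones so the intercept is learned like any other weight.
    Xb = np.hstack([np.ones((100, 1)), X])
    w = np.zeros(2)
    alpha = 0.1

    for _ in range(5000):
        grad = Xb.T @ (Xb @ w - y) / len(y)   # gradient of 1/2 * MSE
        w -= alpha * grad

    ref = LinearRegression().fit(X, y)
    print(w)                          # ~[4.0, 3.0]
    print(ref.intercept_, ref.coef_)  # should roughly match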
5
votes
2 answers

computing gradients for every individual sample in a batch in PyTorch

I'm trying to implement a version of differentially private stochastic gradient descent (e.g., this), which goes as follows: Compute the gradient with respect to each point in the batch of size L, then clip each of the L gradients separately, then…
chirpchirp • 121 • 2 • 8
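
A minimal PyTorch sketch of the per-example clip-then-average step described in the question, assuming a tiny linear model (in practice libraries such as Opacus vectorize this, and the noise-addition step of DP-SGD is omitted here):

    import torch
    import torch.nn as nn

    model = nn.Linear(10, 1)
    loss_fn = nn.MSELoss()
    x = torch.randn(8, 10)          # batch of L = 8 samples
    y = torch.randn(8, 1)
    max_norm = 1.0

    clipped = [torch.zeros_like(p) for p in model.parameters()]
    for i in range(x.size(0)):
        # Gradient of the loss on this single example only.
        loss = loss_fn(model(x[i:i + 1]), y[i:i + 1])
        grads = torch.autograd.grad(loss, list(model.parameters()))
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(max_norm / (norm + 1e-6), max=1.0)
        for acc, g in zip(clipped, grads):
            acc += g * scale          # accumulate the clipped per-sample gradient

    with torch.no_grad():
        for p, acc in zip(model.parameters(), clipped):
            p -= 0.01 * acc / x.size(0)   # plain SGD step on the averaged gradients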
5
votes
2 answers

How to include a custom filter in a Keras based CNN?

I am working on a fuzzy convolution filter for CNNs. I have the function ready - it takes in the 2D input matrix and the 2D kernel/weight matrix. The function outputs the convolved feature or the activation map. Now, I want to use Keras to build the…
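
One common way to plug a fixed, hand-designed kernel into a Keras model is to create a non-trainable Conv2D layer and overwrite its weights; a minimal sketch (the kernel values and shapes here are placeholders, not the fuzzy filter from the question):

    import numpy as np
    import tensorflow as tf

    # Hypothetical fixed 3x3 kernel, shaped (height, width, in_channels, out_channels).
    kernel = np.array([[1, 0, -1],
                       [2, 0, -2],
                       [1, 0, -1]], dtype=np.float32).reshape(3, 3, 1, 1)

    inp = tf.keras.Input(shape=(28, 28, 1))
    conv = tf.keras.layers.Conv2D(1, (3, 3), padding="same", use_bias=False,
                                  trainable=False, name="fixed_filter")
    out = conv(inp)
    model = tf.keras.Model(inp, out)

    # Replace the randomly initialized kernel with the custom one.
    conv.set_weights([kernel])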
5
votes
1 answer

tf.gradients() sums over ys, does it?

https://www.tensorflow.org/versions/r1.6/api_docs/python/tf/gradients In the documentation for tf.gradients(ys, xs) it states that it "Constructs symbolic derivatives of sum of ys w.r.t. x in xs". I am confused about the summing part; I have read…
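
A small graph-mode sketch (written against tf.compat.v1 so it still runs on TF 2) illustrating the sentence from the docs: tf.gradients(ys, xs) returns the derivative of sum(ys) with respect to xs, not a full Jacobian:

    import tensorflow.compat.v1 as tf
    tf.disable_eager_execution()

    x = tf.constant([1.0, 2.0, 3.0])
    y = x * x                                   # a vector of ys

    g1 = tf.gradients(y, x)[0]                  # gradient of sum(y)
    g2 = tf.gradients(tf.reduce_sum(y), x)[0]   # explicit sum for comparison

    with tf.Session() as sess:
        print(sess.run(g1))   # [2. 4. 6.]
        print(sess.run(g2))   # identical: [2. 4. 6.]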
5
votes
2 answers

Confused usage of dropout in mini-batch gradient descent

My question is at the end. An example CNN is trained with mini-batch GD and uses dropout in the last fully-connected layer (line 60) as fc1 = tf.layers.dropout(fc1, rate=dropout, training=is_training) At first I thought the tf.layers.dropout or…
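
For reference, a small sketch of the dropout semantics the question is about, using the current tf.keras.layers.Dropout (tf.layers.dropout from the excerpt is the older, deprecated spelling of the same behaviour): rate is the fraction of units dropped, surviving units are scaled by 1/(1 - rate), and nothing is dropped unless training=True:

    import tensorflow as tf

    x = tf.ones((1, 4))
    drop = tf.keras.layers.Dropout(rate=0.5)

    print(drop(x, training=True))    # some entries zeroed, the rest scaled to 2.0
    print(drop(x, training=False))   # inference: input passes through unchanged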
5
votes
1 answer

Difference between GradientDescentOptimizer and AdamOptimizer in tensorflow?

When using GradientDescentOptimizer instead of AdamOptimizer the model doesn't seem to converge. On the other hand, AdamOptimizer seems to work fine. Is there something wrong with the GradientDescentOptimizer from tensorflow? import matplotlib.pyplot…
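
Often nothing is wrong with either optimizer: plain gradient descent just needs a carefully tuned learning rate, while Adam adapts its per-parameter step sizes. A minimal sketch on a toy quadratic (learning rates chosen arbitrarily, TF 2.x style):

    import tensorflow as tf

    def run(optimizer, steps=200):
        w = tf.Variable(5.0)
        for _ in range(steps):
            optimizer.minimize(lambda: (w - 2.0) ** 2, var_list=[w])
        return w.numpy()

    # Plain SGD with a small rate creeps toward the minimum at 2.0 ...
    print(run(tf.keras.optimizers.SGD(learning_rate=0.01)))
    # ... while Adam with its adaptive steps gets there comfortably.
    print(run(tf.keras.optimizers.Adam(learning_rate=0.1)))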
5
votes
1 answer

`warm_start` Parameter And Its Impact On Computational Time

I have a logistic regression model with a defined set of parameters (warm_start=True). As always, I call LogisticRegression.fit(X_train, y_train) and use the model after to predict new outcomes. Suppose I alter some parameters, say, C=100 and call…
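
A small sketch of what warm_start does in scikit-learn (dataset and parameter values are arbitrary): with warm_start=True, a second fit() after set_params() starts from the previously learned coefficients instead of from scratch, which usually means fewer solver iterations:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

    clf = LogisticRegression(warm_start=True, max_iter=1000)
    clf.fit(X, y)
    print(clf.n_iter_)      # iterations for the cold start

    clf.set_params(C=100)   # change a hyperparameter, keep the coefficients
    clf.fit(X, y)
    print(clf.n_iter_)      # typically fewer iterations than the first fit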
5
votes
1 answer

keras loss jumps to zero randomly at the start of a new epoch

I'm training a network which has multiple losses and both creating and feeding the data into my network using a generator. I've checked the structure of the data and it looks fine generally and it also trains pretty much as expected the majority of…
tryingtolearn • 2,528 • 7 • 26 • 45
5
votes
0 answers

How to train a model in keras with multiple input-output datasets with different batch sizes

I have a supervised learning problem that I am solving with the Keras functional API. As this model is predicting the state of a physical system, I know the supervised model should follow additional constraints. I would like to add that as an…
5
votes
1 answer

Gradient calculation in Hamming loss for multi-label classification

I am doing a multilabel classification using some recurrent neural network structure. My question is about the loss function: my output will be vectors of true/false (1/0) values to indicate each label's class. Many resources said the Hamming loss…
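
One caveat worth sketching: the Hamming loss counts label mismatches, so it is piecewise constant and its gradient is zero almost everywhere; a common differentiable surrogate is element-wise sigmoid cross-entropy, whose gradient with respect to the logits is simply p - y (a tiny numpy illustration with made-up numbers):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    logits = np.array([2.0, -1.0, 0.5])   # one logit per label
    y      = np.array([1.0,  0.0, 1.0])   # multi-hot target

    p = sigmoid(logits)
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    grad = (p - y) / len(y)               # gradient of the mean loss w.r.t. the logits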
5
votes
0 answers

Implementing Feedback Alignment in Tensorflow

I want to implement Direct Feedback Alignment in Tensorflow. Reference paper: https://arxiv.org/pdf/1609.01596v5.pdf, Nøkland (2016) I implemented a simple network that does DFA in pure Python, with the backprop written out explicitly; I just switched the…
iacolippo • 4,133 • 25 • 37
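
Before wiring it into TensorFlow, the DFA update itself fits in a few lines of numpy; a minimal sketch of the rule from Nøkland (2016), where the hidden layer is trained with a fixed random projection B of the output error rather than the transposed forward weights (sizes and learning rate are arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=(10, 1))                # one input sample
    y = np.zeros((3, 1)); y[1] = 1.0            # one-hot target

    W1 = rng.normal(scale=0.1, size=(20, 10))
    W2 = rng.normal(scale=0.1, size=(3, 20))
    B  = rng.normal(scale=0.1, size=(20, 3))    # fixed feedback matrix, never trained
    lr = 0.1

    a1 = W1 @ x
    h  = np.tanh(a1)
    z  = W2 @ h
    p  = np.exp(z - z.max()); p /= p.sum()      # softmax

    e   = p - y                                 # output error
    dW2 = e @ h.T                               # standard delta rule at the output
    dW1 = ((B @ e) * (1 - h ** 2)) @ x.T        # DFA: random feedback replaces W2.T

    W2 -= lr * dW2
    W1 -= lr * dW1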
5
votes
2 answers

Steepest descent to find the solution to a linear system with a Hilbert matrix

I am using the method of steepest descent to figure out the solution to a linear system with a 5x5 Hilbert matrix. I believe the code is fine in the regard that it gives me the right answer. My problem is that I think it is taking too many…
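
For reference, steepest descent with exact line search on A x = b looks like the sketch below; the 5x5 Hilbert matrix is severely ill-conditioned, so a large iteration count is expected rather than a bug (conjugate gradients would converge far faster on the same system):

    import numpy as np
    from scipy.linalg import hilbert

    A = hilbert(5)
    b = np.ones(5)
    x = np.zeros(5)

    for _ in range(20000):
        r = b - A @ x                      # residual = negative gradient of 1/2 x'Ax - b'x
        if np.linalg.norm(r) < 1e-12:
            break
        x += (r @ r) / (r @ A @ r) * r     # exact line-search step along the residual

    print(np.linalg.norm(b - A @ x))       # residual shrinks, but only slowly
    print(np.linalg.solve(A, b))           # direct solution for comparison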
5
votes
1 answer

Is there hope for using Lasagne's Adam implementation for Probabilistic Matrix Factorization?

I am implementing Probabilistic Matrix Factorization models in theano and would like to make use of Adam gradient descent rules. My goal is to have a code that is as uncluttered as possible, which means that I do not want to explicitly keep track of…
fstab • 4,801 • 8 • 34 • 66
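
A minimal sketch of handing the bookkeeping to Lasagne: lasagne.updates.adam builds and stores the per-parameter moment estimates itself, so the training function only needs the loss and the list of shared variables (the factor shapes and regularization below are placeholders, not a full PMF model):

    import numpy as np
    import theano
    import theano.tensor as T
    import lasagne

    floatX = theano.config.floatX
    U = theano.shared(np.random.randn(100, 5).astype(floatX))   # user factors
    V = theano.shared(np.random.randn(200, 5).astype(floatX))   # item factors
    R = T.matrix("R")                                           # observed ratings

    loss = T.mean((R - T.dot(U, V.T)) ** 2) + 0.01 * (T.sum(U ** 2) + T.sum(V ** 2))
    updates = lasagne.updates.adam(loss, [U, V], learning_rate=0.01)
    train_step = theano.function([R], loss, updates=updates)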
5
votes
1 answer

Implementing gradient descent in TensorFlow instead of using the one provided with it

I want to use gradient descent with momentum (keep track of previous gradients) while building a classifier in TensorFlow. So I don't want to use tensorflow.train.GradientDescentOptimizer but I want to use tensorflow.gradients to calculate…
prepmath • 69 • 1 • 6
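
A minimal graph-mode sketch (via tf.compat.v1) of a hand-rolled momentum step built from tf.gradients and tf.assign, which is the approach the question describes; variable names and hyperparameters are arbitrary:

    import tensorflow.compat.v1 as tf
    tf.disable_eager_execution()

    lr, mu = 0.1, 0.9
    w = tf.Variable(5.0)
    loss = (w - 2.0) ** 2

    grad = tf.gradients(loss, [w])[0]
    velocity = tf.Variable(0.0, trainable=False)

    new_v = tf.assign(velocity, mu * velocity - lr * grad)   # remember past gradients
    train_step = tf.assign_add(w, new_v)                     # apply the momentum step

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for _ in range(100):
            sess.run(train_step)
        print(sess.run(w))   # close to the minimizer at 2.0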