Questions tagged [gradient-descent]

Gradient Descent is an algorithm for finding the minimum of a function. It iteratively computes the gradient (the vector of partial derivatives) of the function and takes steps proportional to the negative of that gradient. One major application of Gradient Descent is fitting a parameterized model to a set of data: the function to be minimized is an error function for the model.

Wiki:

Gradient descent is a first-order iterative optimization algorithm used to find the values of the parameters (coefficients) of a function (f) that minimize a cost function (cost).

To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point.

Gradient descent is best used when the parameters cannot be calculated analytically (e.g. using linear algebra) and must be searched for by an optimization algorithm.

Gradient descent is also known as steepest descent, or the method of steepest descent.
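To make the update rule concrete, here is a minimal sketch in Python/NumPy; the quadratic example function and the learning rate are arbitrary choices for illustration, not part of the definition above:

```python
import numpy as np

def gradient_descent(grad, x0, learning_rate=0.1, n_iters=100):
    """Repeatedly step proportional to the negative gradient."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        x = x - learning_rate * grad(x)  # step against the gradient
    return x

# Example: minimize f(x, y) = (x - 3)^2 + (y + 1)^2,
# whose gradient is (2(x - 3), 2(y + 1)).
grad = lambda p: np.array([2 * (p[0] - 3), 2 * (p[1] + 1)])
print(gradient_descent(grad, [0.0, 0.0]))  # approaches (3, -1)
```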


Tag usage:

Questions should be about implementation and programming problems, not about the theoretical properties of the optimization algorithm. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.



1428 questions
0 votes, 1 answer

Gradient descent as applied to a feature-vector bag-of-words classification task

I've watched the Andrew Ng videos over and over and still I don't understand how to apply gradient descent to my problem. He deals pretty much exclusively in high-level conceptual explanations, but what I need are ground-level tactical…
smatthewenglish
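At ground level, the question above usually comes down to logistic regression trained by gradient descent on bag-of-words count vectors. A minimal sketch of that recipe (the toy vocabulary, documents, and labels are invented for illustration, and this is not Ng's exact notation):

```python
import numpy as np

# Hypothetical bag-of-words counts over a 4-word vocabulary, with labels.
# vocab = ["good", "bad", "great", "awful"]
X = np.array([[2, 0, 1, 0],
              [0, 2, 0, 1],
              [1, 0, 2, 0],
              [0, 1, 0, 2]], dtype=float)
y = np.array([1, 0, 1, 0])

w, b, alpha = np.zeros(X.shape[1]), 0.0, 0.1

for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid predictions
    w -= alpha * X.T @ (p - y) / len(y)      # gradient of mean cross-entropy
    b -= alpha * np.mean(p - y)

print(np.round(1.0 / (1.0 + np.exp(-(X @ w + b)))))  # should recover y
```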
0 votes, 0 answers

ADADELTA preserving randomly initialized weights in neural network

I am attempting to train a 2-hidden-layer tanh neural network on the MNIST data set using the ADADELTA algorithm. Here are the parameters of my setup: tanh activation function, 2 hidden layers with 784 units (same as the number of input…
Jeremy Salwen
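For reference, one ADADELTA step per Zeiler (2012) looks like the sketch below. Because both running averages start at zero, the first updates are tiny, which can look as if the weights were stuck at their random initialization; the toy minimization of w^2 is invented for illustration:

```python
import numpy as np

def adadelta_step(w, grad, state, rho=0.95, eps=1e-6):
    """One ADADELTA update; note there is no global learning rate."""
    Eg2, Edx2 = state                          # running averages of g^2, dx^2
    Eg2 = rho * Eg2 + (1 - rho) * grad**2
    dx = -np.sqrt(Edx2 + eps) / np.sqrt(Eg2 + eps) * grad
    Edx2 = rho * Edx2 + (1 - rho) * dx**2
    return w + dx, (Eg2, Edx2)

w, state = np.array([5.0]), (np.zeros(1), np.zeros(1))
for _ in range(500):
    w, state = adadelta_step(w, 2 * w, state)  # gradient of f(w) = w^2
print(w)  # decays toward 0, very slowly at first
```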
0 votes, 0 answers

How to improve gradient descent backpropagation speed in MATLAB Neural Network Toolbox?

I am currently training several hundred different permutations of neural networks. Using Levenberg-Marquardt backpropagation yields results relatively quickly; however, I would prefer to use gradient descent for now, for academic reasons. Unfortunately,…
mesllo
0 votes, 1 answer

How can I add concurrency to neural network processing?

The basics of neural networks, as I understand them, are that there are several inputs, weights, and outputs. There can be hidden layers that add to the complexity of the whole thing. If I have 100 inputs, 5 hidden layers and one output (yes or no),…
Shamoon
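A common first step toward concurrency here is not threads but batching: expressing each layer as one matrix multiply over many inputs lets the underlying linear-algebra library parallelize the work. A sketch with the question's rough dimensions (the weights are random stand-ins):

```python
import numpy as np

rng = np.random.default_rng(0)
batch = rng.standard_normal((32, 100))       # 32 samples, 100 inputs each

# Hypothetical weights: 5 hidden layers, then a single yes/no output unit.
sizes = [100, 64, 64, 64, 64, 64, 1]
weights = [rng.standard_normal((a, b)) * 0.1 for a, b in zip(sizes, sizes[1:])]

a = batch
for W in weights[:-1]:
    a = np.tanh(a @ W)                       # whole batch in one matmul
out = 1 / (1 + np.exp(-(a @ weights[-1])))   # sigmoid output per sample
print(out.shape)                             # (32, 1)
```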
0 votes, 0 answers

fmin_cg not minimizing enough

While doing just a simple implementation of gradient descent (predicting a straight line, with sample points as input), I predicted the line pretty accurately with the iterative method, but using fmin_cg() the accuracy went down; the first thought was to…
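A frequent cause of fmin_cg underperforming a hand-rolled loop is a gradient that does not match the cost (or relying on numerical differentiation). A sketch of a straight-line fit with an analytic gradient passed as fprime; the sample points are invented:

```python
import numpy as np
from scipy.optimize import fmin_cg

# Hypothetical points scattered around y = 2x + 1.
xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([1.1, 2.9, 5.2, 7.0, 8.9])

def cost(theta):
    r = theta[0] + theta[1] * xs - ys        # residuals of the line
    return np.mean(r**2)

def grad(theta):
    r = theta[0] + theta[1] * xs - ys
    return np.array([2 * np.mean(r), 2 * np.mean(r * xs)])

theta = fmin_cg(cost, x0=np.zeros(2), fprime=grad)
print(theta)  # should land near [1, 2]
```

scipy.optimize.check_grad is also useful for verifying that cost and grad agree before blaming the optimizer.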
0 votes, 1 answer

Gradient descent on the inputs of a pre-trained neural network to achieve a target y-value

I have a trained neural network which suitably maps my inputs to my outputs. Is it then possible to specify a desired y output and then use a gradient descent method to determine the optimum input values to get that output? When using…
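Yes: hold the trained weights fixed and run gradient descent on the input itself. A sketch with a one-hidden-layer tanh network whose weights are random stand-ins for a trained model; the target value is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((4, 3)), np.zeros(4)   # frozen "trained" weights
W2, b2 = rng.standard_normal((1, 4)), np.zeros(1)

target = np.array([0.7])
x = rng.standard_normal(3)          # initial guess for the input

for _ in range(200):
    h = np.tanh(W1 @ x + b1)        # forward pass
    out = W2 @ h + b2
    d_out = 2 * (out - target)      # backprop squared error to the input
    d_z1 = (W2.T @ d_out) * (1 - h**2)
    x -= 0.05 * (W1.T @ d_z1)       # descend on x; weights never change

print(out)  # should approach the target 0.7
```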
0 votes, 1 answer

Bi-Threaded processing in Matlab

I have a large-scale gradient descent optimization problem that I am running using MATLAB. The code has two parts: a sequential update part that fires every iteration and updates the parameter vector, and a validation error computation part that…
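The question is MATLAB-specific, but the structure it describes (a sequential update loop plus concurrent validation) maps onto a standard pattern, sketched here in Python with concurrent.futures; the update and validation functions are stand-ins:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def update(theta):
    time.sleep(0.01)               # stand-in for one sequential update
    return theta - 0.1

def validation_error(theta):
    time.sleep(0.05)               # stand-in for the expensive validation pass
    return theta**2

theta, pending = 10.0, None
with ThreadPoolExecutor(max_workers=1) as pool:
    for it in range(100):
        theta = update(theta)                     # runs every iteration
        if pending is None:
            pending = pool.submit(validation_error, theta)
        elif pending.done():                      # collect without blocking
            print(f"iter {it}: validation error {pending.result():.3f}")
            pending = None
```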
0 votes, 0 answers

Logistic regression with gradient descent giving different outcomes for different datasets

I am trying logistic regression using gradient descent with two data sets, and I get a different result for each of them. Dataset1 input:
X = [1 2 3
     1 4 6
     1 7 3
     1 5 5
     1 5 4
     1 6 4
     1 3 4
     1 4 5
     1 1 2
     1 3…
Sam
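Different outcomes across datasets with the same code often trace back to feature scale interacting with a fixed learning rate. A sketch that standardizes the non-bias columns before descending; the first rows of X are taken from the excerpt above, while the labels are invented:

```python
import numpy as np

def train_logreg(X, y, alpha=0.1, iters=2000):
    # Standardize non-bias columns so one learning rate fits both datasets.
    mu, sigma = X[:, 1:].mean(axis=0), X[:, 1:].std(axis=0)
    Xs = np.column_stack([X[:, 0], (X[:, 1:] - mu) / sigma])
    theta = np.zeros(Xs.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-(Xs @ theta)))       # sigmoid hypothesis
        theta -= alpha * Xs.T @ (p - y) / len(y)  # gradient descent step
    return theta

X = np.array([[1, 2, 3], [1, 4, 6], [1, 7, 3], [1, 5, 5]], dtype=float)
y = np.array([0, 0, 1, 1])                        # hypothetical labels
print(train_logreg(X, y))
```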
0 votes, 1 answer

Is the mini-batch gradient just the sum of online gradients?

I am adapting code for a neural network that does online training to work with mini-batches. Is the mini-batch gradient for a weight (de/dw) just the sum of the gradients for the samples in the mini-batch? Or is it some non-linear…
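For losses that sum (or average) over samples, yes: differentiation is linear, so the mini-batch gradient is exactly the sum (or mean) of the per-sample gradients. A quick numerical check for squared error on a linear model, with random stand-in data:

```python
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.standard_normal((8, 3)), rng.standard_normal(8)
w = rng.standard_normal(3)

def sample_grad(i):
    # Gradient of (x_i . w - y_i)^2 with respect to w.
    return 2 * (X[i] @ w - y[i]) * X[i]

batch_grad = 2 * X.T @ (X @ w - y)   # gradient of the summed batch loss
print(np.allclose(batch_grad, sum(sample_grad(i) for i in range(8))))  # True
```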
0 votes, 1 answer

Gradient descent not working as expected

I am using Stochastic Gradient Descent from scikit-learn (http://scikit-learn.org/stable/modules/sgd.html). The example given in the link works like this:
>>> from sklearn.linear_model import SGDClassifier
>>> X = [[0., 0.], [1., 1.]]
>>> y = [0,…
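The excerpt is cut off; the example on the linked scikit-learn page continues roughly as follows (quoted from memory of that page, so treat the exact arguments as an assumption):

```python
from sklearn.linear_model import SGDClassifier

X = [[0., 0.], [1., 1.]]
y = [0, 1]
clf = SGDClassifier(loss="hinge", penalty="l2")
clf.fit(X, y)
print(clf.predict([[2., 2.]]))  # array([1])
```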
0 votes, 1 answer

Vectorized gradient descent basics

I'm implementing simple gradient descent in Octave but it's not working. Here is the data I'm using:
X = [1 2 3
     1 4 5
     1 6 7]
y = [10 11 12]
theta = [0 0 0]
alpha = 0.001
itr = 50
This is my gradient…
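With this data, the whole batch update vectorizes to one line per iteration. A sketch in Python/NumPy using the question's values; the Octave equivalent of the update is theta = theta - (alpha/m) * X' * (X*theta - y), and note that y as written above is a row vector, which typically needs transposing in Octave:

```python
import numpy as np

X = np.array([[1, 2, 3], [1, 4, 5], [1, 6, 7]], dtype=float)
y = np.array([10, 11, 12], dtype=float)
theta, alpha, iters = np.zeros(3), 0.001, 50
m = len(y)

for _ in range(iters):
    # Simultaneous, fully vectorized update of every parameter.
    theta -= (alpha / m) * X.T @ (X @ theta - y)

print(theta)
```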
0 votes, 1 answer

During Stochastic Gradient Descent, what's the difference between these two ways of updating the hypothesis?

I have a question about updating theta during Stochastic GD. I have two ways to update theta: 1) Use the previous theta to get all the hypotheses for all samples, and then update theta by each sample. Like: hypothese = np.dot(X,…
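The two schemes differ in which theta produces each hypothesis. A sketch contrasting them on invented data (linear model, squared error): way 1 updates from hypotheses precomputed with the stale theta, while way 2, standard SGD, recomputes the hypothesis with the current theta before every update:

```python
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.standard_normal((10, 3)), rng.standard_normal(10)
alpha = 0.1

# Way 1: all hypotheses computed once from the previous theta.
theta1 = np.zeros(3)
h = X @ theta1                        # frozen before the sweep
for i in range(len(y)):
    theta1 -= alpha * (h[i] - y[i]) * X[i]

# Way 2: true SGD, hypothesis recomputed with the current theta.
theta2 = np.zeros(3)
for i in range(len(y)):
    h_i = X[i] @ theta2               # uses the freshest parameters
    theta2 -= alpha * (h_i - y[i]) * X[i]

print(theta1, theta2)                 # the two sweeps generally differ
```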
0 votes, 1 answer

missing value where TRUE/FALSE needed in R

When I run the following code without commenting out gr.ascent(MMSE, 0.5, verbose=TRUE), I receive the error Error in b1 * x : 'b1' is missing, but when I comment that line I receive the following error when testing MMSE with these arguments…
Mona Jalal
0 votes, 1 answer

L-BFGS from RISO not working

I am testing RISO's implementation of the L-BFGS algorithm for function minimization in logistic regression in Java. Here is the link to the class that I am using. To test the library, I am trying to minimize the function: f(x) = 2*(x1^2) + 4*x2 +…
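The RISO call itself can't be reproduced from the truncated excerpt, but cross-checking the same minimization against an independent L-BFGS implementation (SciPy's here, not RISO's) is a quick way to localize a bug to the objective or the gradient; the function below is a convex stand-in, since the question's f is cut off:

```python
import numpy as np
from scipy.optimize import minimize

def f(x):
    return 2 * x[0]**2 + 4 * (x[1] - 1)**2   # stand-in convex objective

def grad(x):
    return np.array([4 * x[0], 8 * (x[1] - 1)])

res = minimize(f, x0=np.zeros(2), jac=grad, method="L-BFGS-B")
print(res.x)  # should be near [0, 1]
```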
0 votes, 3 answers

Incorrect Results from Gradient Descent in Matlab

I'm taking the course in MATLAB, and I have done a gradient descent implementation, but it gives incorrect results. The code:
for iter = 1:num_iters
    sumTheta1 = 0;
    sumTheta2 = 0;
    for s = 1:m
        sumTheta1 = theta(1) + theta(2) .* X(s,2) - y(s);
…
Pedro.Alonso
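In the excerpt above, sumTheta1 is assigned rather than accumulated inside the inner loop, which alone would produce incorrect results. A corrected accumulate-then-update step, sketched in Python with invented data standing in for the course's:

```python
import numpy as np

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # bias column + feature
y = np.array([2.0, 3.0, 4.0])                        # satisfies y = 1 + x
theta = np.zeros(2)
alpha, m = 0.1, len(y)

for _ in range(1000):
    err = X @ theta - y                  # residual for every sample at once
    sum1 = np.sum(err)                   # accumulated, not overwritten
    sum2 = np.sum(err * X[:, 1])
    theta -= (alpha / m) * np.array([sum1, sum2])  # simultaneous update

print(theta)  # should approach [1, 1]
```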