Questions tagged [gradient-descent]

Gradient Descent is an algorithm for finding a minimum of a function. It iteratively calculates the partial derivatives (gradient) of the function and descends in steps proportional to the negative of that gradient. One major application of Gradient Descent is fitting a parameterized model to a set of data: the function to be minimized is an error function for the model.

Wiki:

Gradient descent is a first-order iterative optimization algorithm. It is used to find the values of the parameters (coefficients) of a function (f) that minimize a cost function (cost).

To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point.

Gradient descent is best used when the parameters cannot be calculated analytically (e.g. using linear algebra) and must be searched for by an optimization algorithm.

Gradient descent is also known as steepest descent, or the method of steepest descent.
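The update rule described above (steps proportional to the negative of the gradient) can be sketched in a few lines of Python. This is a minimal illustration under assumed names, not tied to any particular question on the tag:

```python
def gradient_descent(grad, x0, alpha=0.1, num_iters=100):
    """Repeatedly step in the direction of the negative gradient."""
    x = x0
    for _ in range(num_iters):
        x = x - alpha * grad(x)
    return x

# Minimize f(x) = (x - 3)^2; its gradient is f'(x) = 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

With a step size small enough for the function at hand, the iterate converges to the minimizer (here, 3.0); too large a step size makes it diverge, a recurring theme in the questions below.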


Tag usage:

Questions with this tag should be about implementation and programming problems, not about the theoretical properties of the optimization algorithm. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.



1428 questions
0 votes, 1 answer

Backpropagation neural network, too many neurons in layer causing output to be too high

Having a neural network with a lot of inputs causes my network problems: the neural network gets stuck, and the feed-forward calculation always gives the output as 1.0 because the output sum is too big, and while doing backpropagation, the sum of gradients…
0 votes, 2 answers

Reasons not to use tf.train.AdamOptimizer?

I've read this article and it seems like, given enough memory, you should always use Adam over the other possible optimization algorithms (adadelta, rmsprop, vanilla sgd, etc). Are there any examples, either toy or real world, in which Adam will do…
George
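For context on the Adam question above: Adam keeps exponential moving averages of the gradient and its square and uses bias-corrected versions of both to scale each step. A minimal per-parameter sketch in plain Python, following the update rule from the Adam paper; this is not the `tf.train.AdamOptimizer` implementation, and the names are illustrative:

```python
import math

def adam_step(theta, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter (t counts from 1)."""
    m = beta1 * m + (1 - beta1) * grad          # moving average of the gradient
    v = beta2 * v + (1 - beta2) * grad * grad   # moving average of its square
    m_hat = m / (1 - beta1 ** t)                # bias correction: m and v start at 0
    v_hat = v / (1 - beta2 ** t)
    theta = theta - alpha * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize (theta - 1)^2 starting from theta = 0.
theta, m, v = 0.0, 0.0, 0.0
for t in range(1, 201):
    theta, m, v = adam_step(theta, 2 * (theta - 1.0), m, v, t, alpha=0.05)
```

Whether Adam beats plain SGD or the other optimizers in a given setting is exactly the empirical question the post asks; the sketch only shows the mechanics of the update.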
0 votes, 1 answer

Translating Logistic Regression loss function to Softmax

I currently have a program which takes a feature vector and classification, and applies it to a known weight vector to generate a loss gradient using Logistic Regression. This is that code: double[] grad = new double[featureSize]; //dot…
user2785277
0 votes, 1 answer

Instead of LBFGS, using gradient descent in sparse autoencoder

In Andrew Ng's lecture notes, they use LBFGS and get some hidden features. Can I use gradient descent instead and produce the same hidden features? All the other parameters are the same; I only change the optimization algorithm. Because when I use…
0 votes, 0 answers

Gradient Boosting Classifier - n_estimators

I am trying the Gradient Boosting Classifier for my project. I am using 100 samples, with leave-one-out cross-validation. As far as I know, GBC should give good results with a large n_estimators, but I am getting low results with large…
0 votes, 1 answer

Gradient descent not updating theta values

Using the vectorized version of gradient descent as described at "gradient descent seems to fail": theta = theta - (alpha/m * (X * theta - y)' * X)'; The theta values are not being updated, so whatever the initial theta value is, that is the value that is set…
blue-sky
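The one-line Octave update quoted in the question above translates directly to NumPy. A sketch for linear regression, with made-up illustrative data (the variable names mirror the question, not any particular library API):

```python
import numpy as np

def gradient_descent(X, y, theta, alpha, num_iters):
    """Vectorized batch gradient descent for linear regression."""
    m = len(y)
    for _ in range(num_iters):
        # NumPy equivalent of the Octave line:
        #   theta = theta - (alpha/m * (X * theta - y)' * X)';
        theta = theta - (alpha / m) * (X.T @ (X @ theta - y))
    return theta

# Fit y = 2x (zero intercept); X carries a bias column of ones.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([2.0, 4.0, 6.0])
theta = gradient_descent(X, y, np.zeros(2), alpha=0.1, num_iters=5000)
```

When theta never changes, two common causes are assigning the update to a new variable instead of back to theta, or an alpha so small that the change is invisible at print precision.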
0 votes, 2 answers

newff and train functions of Python's neurolab give inconsistent results for the same code and input

While the input is the same and the code is the same, I get two different results when run multiple times. There are only two unique outputs, though. I do not know what part of the code is randomized, and I'm having a hard time figuring out where the…
0 votes, 1 answer

Maximum Likelihood Estimation of a log function with several parameters

I am trying to find out the parameters for the function below: $$ \log L(\alpha,\beta,v) = v/\beta(e^{-\beta T} -1) + \alpha/\beta \sum_{i=1}^{n}(e^{-\beta(T-t_i)} -1) + \sum_{i=1}^{N}\log(v e^{-\beta t_i} + \alpha \sum_{j=1}^{jmax(t_i)}…
0 votes, 1 answer

Active Contours (Snakes) Gradient Descent

I am doing research on the active contour (snake) using gradient descent, which was implemented by Kass. The two pieces of documentation that I have been reading can be found here: the original paper and a more descriptive version. My question is in…
0 votes, 1 answer

"Function with duplicate name cannot be defined" error but no duplicate function

While trying to write a function for gradient descent in Matlab I got the following error: Function with duplicate name "gradientDescent" cannot be defined. The program I'm working on has two functions in it, and when I remove the second one the…
Paco Poler
0 votes, 2 answers

Plot vectors of gradient descent in R

I've coded a gradient descent algorithm in R and now I'm trying to "draw" the path of the vectors. I've drawn the points in my contour plot, but it's not correct, because nobody knows what happened first. In my algorithm I always have a previous state…
Carlos
0 votes, 1 answer

Neural Network bad convergence

I read a lot about NNs in the last two weeks; I think I saw pretty much every "XOR" approach tutorial on the net. But I wasn't able to make my own work. I started with a simple "OR" neuron approach, giving good results. I think my problem is in…
0 votes, 1 answer

Mutable Vector field is not updating in F#

let gradientDescent (X : Matrix) (y :Vector) (theta : Vector) alpha (num_iters : int) = let J_history = Vector.Build.Dense(num_iters) let m = y.Count |> double theta.At(0, 0.0) let x = …
Luke Xu
0 votes, 0 answers

Why no automatic termination for stochastic gradient descent in the frameworks?

I checked out some notable open-source frameworks with SGD implementations - scikit-learn, vowpal-wabbit and tensor-flow. All of them leave the task of deciding how many iterations to run to the user! scikit requires the user to specify it explicitly,…
ihadanny
0 votes, 1 answer

Non-vectorized Gradient Descent

I have a bug in the following code, which is returning inf, inf for the Thetas. def gradient_descent(x, y, t0, t1, alpha, num_iters): for i in range(num_iters): t0_sum = 0 t1_sum = 0 for i in range(m_num): # I have a feeling that the following…
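A returned inf, inf in a loop like the last question's usually means alpha is too large for the data, or the two parameters are not updated from the same pass. A working plain-Python sketch of the same non-vectorized update, with illustrative data (t0, t1 and the overall shape follow the question; the helper itself is an assumed reconstruction, not the asker's code):

```python
def gradient_descent(x, y, t0, t1, alpha, num_iters):
    """Non-vectorized gradient descent for the model h(x) = t0 + t1 * x."""
    m = len(x)
    for _ in range(num_iters):
        t0_sum = 0.0
        t1_sum = 0.0
        for i in range(m):
            err = (t0 + t1 * x[i]) - y[i]   # prediction error on sample i
            t0_sum += err
            t1_sum += err * x[i]
        # Update both parameters from the same pass, not one after the other.
        t0 -= alpha * t0_sum / m
        t1 -= alpha * t1_sum / m
    return t0, t1

# Fit y = 1 + 2x.
t0, t1 = gradient_descent([1.0, 2.0, 3.0], [3.0, 5.0, 7.0], 0.0, 0.0, 0.1, 5000)
```

Note also that the question's inner loop reuses `i` as the outer loop variable, which silently clobbers the iteration counter; using `_` for the outer loop avoids that.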