Questions tagged [gradient-descent]

Gradient Descent is an algorithm for finding the minimum of a function. It iteratively calculates the partial derivatives (gradient) of the function and descends in steps proportional to the negative of that gradient. One major application of Gradient Descent is fitting a parameterized model to a set of data: the function to be minimized is an error function for the model.

Wiki:

Gradient descent is a first-order iterative optimization algorithm. It is used to find the values of the parameters (coefficients) of a function (f) that minimize a cost function (cost).

To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point.

Gradient descent is best used when the parameters cannot be calculated analytically (e.g. using linear algebra) and must be searched for by an optimization algorithm.

Gradient descent is also known as steepest descent, or the method of steepest descent.
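The update rule described above can be sketched in a few lines of Python (the quadratic objective, step size, and iteration count here are illustrative assumptions, not part of the tag wiki):

```python
# Minimal gradient descent: repeatedly step against the gradient.
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)  # step proportional to the negative gradient
    return x

# Example objective f(x) = (x - 3)**2, with derivative f'(x) = 2 * (x - 3);
# its minimum is at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)  # close to 3.0
```

Each step shrinks the distance to the minimum by a constant factor (here 0.8), which is why the parameters cannot generally be found in one shot when no analytic solution exists.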


Tag usage:

Questions should be about implementation and programming problems, not about the theoretical properties of the optimization algorithm. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.



1428 questions
-1
votes
1 answer

Implementing subgradient stochastic descent in Python

I want to implement subgradient and stochastic descent using a cost function, calculate the number of iterations it takes to find a perfect classifier for the data, and also find the weights (w) and bias (b). The dataset is four-dimensional; this is…
-1
votes
1 answer

How should I read the sum of the values that RMSProp produces?

I have a 2D time series dataset with integer outputs ranging from 1,000,000 to 2,000,000 on any given day. Of course my data is not limited to that, as I can sum up to weekly values, so the range increases to over 10,000,000. I'm able to achieve RMSE =…
-1
votes
2 answers

My custom loss function in PyTorch does not train

My custom loss function in PyTorch does not update during training; the loss stays exactly the same. I am trying to write a custom loss function based on the false positive and false negative rates. I am giving you a simplified version of the code. Any…
-1
votes
2 answers

Why isn't my gradient descent algorithm working?

I made a gradient descent algorithm in Python and it doesn't work. My m and b values keep increasing and never stop until I get -inf or the "overflow encountered in square" error. import numpy as np x = np.array([2,3,4,5]) y =…
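Runaway m and b values like these are the classic signature of a learning rate that is too large for the data. A small sketch (the y values and hyperparameters are assumptions, since the excerpt is truncated) shows the same x data either converging or overflowing depending only on the step size:

```python
import numpy as np

x = np.array([2.0, 3.0, 4.0, 5.0])
y = np.array([5.0, 7.0, 9.0, 11.0])  # assumed targets (y = 2x + 1); the original is cut off

def fit(learning_rate, steps=5000):
    m, b = 0.0, 0.0
    for _ in range(steps):
        err = m * x + b - y
        m -= learning_rate * 2 * np.mean(err * x)  # d(MSE)/dm
        b -= learning_rate * 2 * np.mean(err)      # d(MSE)/db
        if not np.isfinite(m):
            return m, b  # diverged: overflow, exactly as the question describes
    return m, b

print(fit(0.5))   # too large a step: m overflows to inf/nan
print(fit(0.01))  # small enough: m, b approach 2 and 1
```

The stability threshold depends on the scale of x (roughly the largest eigenvalue of the data's second-moment matrix), so normalizing the inputs or shrinking the learning rate are the usual fixes.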
-1
votes
2 answers

Best Way to Overcome Early Convergence for Machine Learning Model

I have built a machine learning model that tries to predict weather data; in this case I am predicting whether or not it will rain tomorrow (a binary Yes/No prediction). In the dataset there are about 50 input variables, and I have…
-1
votes
1 answer

How to solve logistic regression using gradient descent in Octave?

I am taking the Machine Learning course on Coursera by Andrew Ng. I have written code for logistic regression in Octave, but it is not working. Can someone help me? I have taken the dataset from the following link: Titanic survivors Here is my…
-1
votes
1 answer

How to do a gradient descent problem (machine learning)?

Could somebody please explain how to do a gradient descent problem WITHOUT the context of the cost function? I have seen countless tutorials that explain gradient descent using the cost function, but I really don't understand how it works in a more…
-1
votes
1 answer

Why does the intercept parameter increase in an unexpected direction?

I'm running 2 gradient descent iterations (initial conditions: learning_rate = 0.1 and [w0,w1] = [0,0]) to find the 2 parameters (y_hat = w0 + w1*x) of a linear model that fits a simple dataset: x=[0,1,2,3,4] and y=[0,2,3,8,17]. By using the closed…
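The two iterations described can be reproduced directly, assuming a mean-squared-error cost with the conventional 2/n gradient factor (the excerpt does not confirm this convention):

```python
# Two gradient descent iterations on y_hat = w0 + w1 * x with an MSE cost,
# learning_rate = 0.1 and [w0, w1] = [0, 0], as in the question.
x = [0, 1, 2, 3, 4]
y = [0, 2, 3, 8, 17]
w0, w1, lr, n = 0.0, 0.0, 0.1, len(x)

for step in (1, 2):
    errors = [w0 + w1 * xi - yi for xi, yi in zip(x, y)]
    grad_w0 = 2 / n * sum(errors)                              # d(MSE)/dw0
    grad_w1 = 2 / n * sum(e * xi for e, xi in zip(errors, x))  # d(MSE)/dw1
    w0 -= lr * grad_w0
    w1 -= lr * grad_w1
    print(step, round(w0, 3), round(w1, 3))  # 1 1.2 4.0, then 2 0.56 2.72
```

On this data the intercept first jumps to 1.2 and then falls back to 0.56: early on its gradient is dominated by the large overall error, which can look like movement in an unexpected direction before the slope term catches up.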
-1
votes
1 answer

Will gradient descent be stuck in non-minima point? How can we prove its correctness?

For the stuck example, let our cost function be J(x,y) = x * y, and suppose we are currently at the point (0,0). Then the gradient vector is (0,0), which means the gradient descent algorithm will not move us to any other point. For the latter question, let's…
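The stuck scenario is easy to reproduce with plain full-batch gradient descent on J(x, y) = x * y, whose gradient vanishes at the saddle point (0, 0):

```python
def grad_J(x, y):
    # J(x, y) = x * y  =>  dJ/dx = y, dJ/dy = x
    return y, x

x, y, lr = 0.0, 0.0, 0.1
for _ in range(1000):
    gx, gy = grad_J(x, y)
    x, y = x - lr * gx, y - lr * gy

print(x, y)  # still (0.0, 0.0): a zero gradient means the update never moves the point
```

In practice, stochastic gradient noise or a random (non-zero) initialization usually keeps iterates away from exact saddle points like this one.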
-1
votes
1 answer

Why is gradient descent not working properly?

This is my first attempt at coding a multilayer neural network in Python (code is attached below). I'm having a hard time using the gradient descent partial derivatives, because it seems the weights are not being updated properly.…
-1
votes
1 answer

Gradient descent algorithm in machine learning

I am a beginner in machine learning and have a problem with the gradient descent algorithm. In the code mentioned below, my doubt is: during the first iteration the value of x will be 1, in the second iteration the value of x will be 2, in the third iteration the value of x will be 3, in the fourth…
-1
votes
1 answer

Python function returning wrong value

I have a function (gradient descent) in Python that returns some values: import pandas as pd import numpy as np import matplotlib.pyplot as plt def read_data(file): df = pd.read_excel(file) x_data= np.array(df['X_axis']) y_data =…
-1
votes
1 answer

Poor Accuracy of Gradient Descent Perceptron

I'm trying to make a start with neural networks from the very beginning. This means starting off toying with perceptrons. At the moment I'm trying to implement batch gradient descent. The guide I'm following provided the following pseudocode: I've…
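The pseudocode from the guide is not shown, but batch gradient descent for a single sigmoid perceptron generally follows the shape below (the AND dataset, learning rate, and iteration count are illustrative assumptions):

```python
import numpy as np

# Batch gradient descent for a single sigmoid perceptron.
# The AND dataset and all hyperparameters here are illustrative assumptions.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=2)  # small random initial weights
b = 0.0
lr = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(5000):
    p = sigmoid(X @ w + b)          # forward pass over the whole batch
    err = p - y                     # cross-entropy gradient w.r.t. the pre-activation
    w -= lr * (X.T @ err) / len(X)  # average the gradient over the batch
    b -= lr * err.mean()

pred = (sigmoid(X @ w + b) > 0.5).astype(int)
print(pred.tolist())  # learns AND: [0, 0, 0, 1]
```

The "batch" part is the averaging step: the weights move once per pass over all samples, rather than after each sample as in stochastic updates, so poor accuracy often traces back to mixing the two update schedules.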
-1
votes
1 answer

Python gradient descent does not converge

So I'm a newbie to machine learning and I have been trying to implement gradient descent. My code seems to be right (I think), but it doesn't converge to the global optimum. import numpy as np import pandas as pd import matplotlib.pyplot as plt def…
-1
votes
1 answer

How to increase accuracy of network running on MNIST

I followed this code: https://github.com/HyTruongSon/Neural-Network-MNIST-CPP It is quite easy to understand and produces 94% accuracy. I have to convert it to a network with deeper layers (ranging from 5 to 10). To make myself comfortable,…