I am new to machine learning and I am trying to analyze classification algorithms for a project of mine. I came across SGDClassifier in the sklearn library, but a lot of papers refer to SGD as an optimization technique. Can someone please explain how SGDClassifier is implemented?

Stochastic gradient descent is a stochastic approximation of the gradient descent optimization method for minimizing an objective function that is written as a sum of differentiable functions. In other words, SGD tries to find minima or maxima by iteration. – seralouk Aug 02 '17 at 09:50
3 Answers
Taken from the SGD scikit-learn documentation:
- loss="hinge": (soft-margin) linear Support Vector Machine
- loss="modified_huber": smoothed hinge loss
- loss="log": logistic regression

SGD is indeed a technique that is used to find the minima of a function. SGDClassifier is a linear classifier (by default in sklearn it is a linear SVM) that uses SGD for training (that is, looking for the minima of the loss using SGD). According to the documentation:
SGDClassifier is a Linear classifiers (SVM, logistic regression, a.o.) with SGD training.
This estimator implements regularized linear models with stochastic gradient descent (SGD) learning: the gradient of the loss is estimated each sample at a time and the model is updated along the way with a decreasing strength schedule (aka learning rate). SGD allows minibatch (online/out-of-core) learning, see the partial_fit method. For best results using the default learning rate schedule, the data should have zero mean and unit variance.
This implementation works with data represented as dense or sparse arrays of floating point values for the features. The model it fits can be controlled with the loss parameter; by default, it fits a linear support vector machine (SVM).
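The two points the quoted documentation makes, zero-mean/unit-variance input and minibatch learning via partial_fit, can be sketched together; the synthetic data and batch size here are illustrative assumptions.

```python
# Out-of-core style training: scale the features, then feed the
# classifier one mini-batch at a time via partial_fit.
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # a linearly separable target

# Zero mean / unit variance, as the docs recommend for the default
# learning-rate schedule.
X = StandardScaler().fit_transform(X)

clf = SGDClassifier(random_state=0)
for start in range(0, 1000, 100):
    Xb, yb = X[start:start + 100], y[start:start + 100]
    # classes must be passed on the first partial_fit call
    clf.partial_fit(Xb, yb, classes=np.array([0, 1]))
print(clf.score(X, y))
```

In a real out-of-core setting each batch would come from disk or a stream rather than slices of one array.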

Let's break each word down into its plain English meaning:
Stochastic - random; Gradient - slope; Descent - downwards.
Basically, this technique is used as an optimization algorithm for finding the parameters that minimize a convex loss/cost function. With it we can find the slope of the line that has minimal loss for linear classifiers, i.e. SVM and logistic regression.
Gradient descent can be performed in several ways:
- Batch Gradient Descent
- Stochastic Gradient Descent
- Mini-batch Gradient Descent
For more details on the above, please go through the referred link: https://www.geeksforgeeks.org/ml-stochastic-gradient-descent-sgd/
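To make the "random slope downwards" intuition concrete, here is a minimal plain-NumPy sketch of the stochastic variant: each step picks one random sample and moves the weights against that sample's gradient. The squared loss and the toy data are illustrative assumptions, not what SGDClassifier uses by default.

```python
# Minimal SGD: minimize sum_i (x_i . w - y_i)^2 one sample at a time.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_w = np.array([2.0, -3.0])
y = X @ true_w  # noiseless linear target

w = np.zeros(2)
lr = 0.05
for step in range(2000):
    i = rng.integers(len(X))              # stochastic: one random sample
    grad = 2 * (X[i] @ w - y[i]) * X[i]   # gradient of (x_i . w - y_i)^2
    w -= lr * grad                        # descent: step against the slope

print(w)  # recovers weights close to [2, -3]
```

Batch gradient descent would average the gradient over all 200 samples per step; mini-batch averages over a small subset.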
