-2

Since I'm new to data science, I'd like to know whether any specific behavior of the data is responsible for overfitting and/or underfitting. Suppose we are doing linear regression and use gradient descent to find the best-fit line. How can overfitting or underfitting still occur? I know what overfitting and underfitting are; what I don't understand is how they are possible when gradient descent has already given us the best-fit line. I hope my question is clear.

Thanks and regards.

raobabar
  • Welcome to StackOverflow. Please follow the posting guidelines in the help documentation, as suggested when you created this account. [On topic](https://stackoverflow.com/help/on-topic), [how to ask](https://stackoverflow.com/help/how-to-ask), and ... [the perfect question](https://codeblog.jonskeet.uk/2010/08/29/writing-the-perfect-question/) apply here. You're using terms that don't particularly go together. – Prune Nov 06 '19 at 18:20
  • In particular, linear regression can be solved with a direct computation. Using gradient descent in multiple dimensions is typically done by applying modifications of Newton-Raphson to a quadratic error function. Such models do not have the complex search space that is susceptible to **overfitting**. – Prune Nov 06 '19 at 18:24
  • Please detail your situation, including a [minimal, reproducible example](https://stackoverflow.com/help/minimal-reproducible-example). – Prune Nov 06 '19 at 18:25
  • I'm voting to close this question as off-topic because it is not about programming as defined in the guidelines. – desertnaut Nov 07 '19 at 15:08

2 Answers

0

A small number of samples in the data can be a major cause of overfitting. Even if your model is simple, low variance (or variation) in the data samples can lead the model to perform well on "only" those samples, and it may not generalize well.
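
To make the sample-size point concrete, here is a minimal sketch on synthetic data (the data, the helper name `train_test_mse`, and all parameter values are my own assumptions for illustration, not from the question). It fits ordinary least squares with 10 features, once on 12 samples and once on 2000, and compares training and testing error:

```python
import numpy as np

def train_test_mse(n_train, n_features=10, n_test=1000, noise=1.0, seed=0):
    """Fit least squares on synthetic linear data; return (train MSE, test MSE)."""
    rng = np.random.default_rng(seed)
    true_w = rng.normal(size=n_features)  # hypothetical "true" coefficients

    X_train = rng.normal(size=(n_train, n_features))
    y_train = X_train @ true_w + noise * rng.normal(size=n_train)
    X_test = rng.normal(size=(n_test, n_features))
    y_test = X_test @ true_w + noise * rng.normal(size=n_test)

    # Best-fit coefficients for the TRAINING data only.
    w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

    train_mse = float(np.mean((X_train @ w - y_train) ** 2))
    test_mse = float(np.mean((X_test @ w - y_test) ** 2))
    return train_mse, test_mse

small = train_test_mse(n_train=12)    # barely more samples than features
large = train_test_mse(n_train=2000)  # plenty of samples

print("12 samples:   train MSE = %.3f, test MSE = %.3f" % small)
print("2000 samples: train MSE = %.3f, test MSE = %.3f" % large)
```

With only 12 samples the line fits the training points very closely but does much worse on fresh data; with 2000 samples the two errors should end up close to each other.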

Anant Mittal
0

We can detect overfitting in a linear model by looking at the number of features together with the training error and the testing error.

If the model overfits:
1. Too many features are used for training relative to the amount of data.
2. The training error is much lower than the testing error.

If the model underfits:
1. Too few features are used for training, so the model is too simple for the data.
2. Both the training error and the testing error are high.
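
The checks above can be sketched as follows. This is only an illustration under my own assumptions: the data is synthetic (a noisy sine curve), and the "number of features" is controlled through the polynomial degree, since a degree-d polynomial fit is linear regression on d power features of x.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def make_data(n):
    """Synthetic data (an assumption for this sketch): noisy sine curve on [0, 1]."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = np.sin(2 * np.pi * x) + 0.2 * rng.normal(size=n)
    return x, y

x_train, y_train = make_data(20)
x_test, y_test = make_data(1000)

def mse(model, x, y):
    """Mean squared error of a fitted polynomial on the given points."""
    return float(np.mean((model(x) - y) ** 2))

errors = {}
for degree in (1, 3, 12):
    # Least-squares fit of a polynomial with `degree` power features.
    model = Polynomial.fit(x_train, y_train, degree)
    errors[degree] = (mse(model, x_train, y_train), mse(model, x_test, y_test))
    print("degree %2d: train MSE = %.4f, test MSE = %.4f"
          % (degree, errors[degree][0], errors[degree][1]))
```

A straight line (degree 1) cannot follow the curve, so both errors stay high, which is the underfitting pattern. As the degree grows the training error can only go down, and a high degree on 20 points will typically show the overfitting pattern of a very low training error with a noticeably higher testing error.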

Gradient descent is a good option for fitting the model. But it only minimizes the training error, so the fitted model may still overfit and fail on real-life data.
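
As a rough sketch of why the optimizer is not the culprit (synthetic data, and the learning rate and step count are values I picked for this example), plain gradient descent on the mean squared error ends up at the same coefficients as the direct least-squares computation. Both give the best fit to the training data and nothing more:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training data (assumed for illustration): 50 samples, 3 features.
X = rng.normal(size=(50, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 0.3 * rng.normal(size=50)

# Direct least-squares solution.
w_exact, *_ = np.linalg.lstsq(X, y, rcond=None)

# Batch gradient descent on the mean squared error.
w_gd = np.zeros(3)
learning_rate = 0.1
for _ in range(2000):
    gradient = (2.0 / len(y)) * X.T @ (X @ w_gd - y)
    w_gd -= learning_rate * gradient

print("direct solution: ", w_exact)
print("gradient descent:", w_gd)
```

So "best fit" here means best on the samples you trained on. Whether that fit also works on new data depends on the data and on the model's complexity, which is why the training and testing errors need to be compared.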

Hope this helps.