Could you please explain what the "fit" method in scikit-learn does? Why is it useful?
1 Answer
In a nutshell: fitting is training. Once the model has been trained, it can be used to make predictions, usually with a .predict() method call.
To elaborate: Fitting your model to (i.e. using the .fit() method on) the training data is essentially the training part of the modeling process. It finds the coefficients for the equation specified via the algorithm being used (take for example umutto's linear regression example, above).
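As a rough illustration of "finding the coefficients", here is a minimal sketch using LinearRegression. The toy data is made up for this example and follows y = 2x + 1 exactly, so the fitted slope and intercept should come out very close to 2 and 1.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy training data (invented for illustration) that follows y = 2x + 1.
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([1.0, 3.0, 5.0, 7.0])

model = LinearRegression()
model.fit(X_train, y_train)  # training: estimates the slope and intercept

slope = model.coef_[0]
intercept = model.intercept_
print(slope, intercept)  # approximately 2.0 and 1.0

# predict() evaluates the learned equation on a new input.
prediction = model.predict(np.array([[10.0]]))[0]
print(prediction)  # approximately 21.0
```

Before fit is called, attributes such as coef_ and intercept_ do not exist on the estimator; they are created by the fitting step.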
Then, for a classifier, you can classify incoming data points (from a test set, or otherwise) using the predict method. Or, in the case of regression, your model will interpolate/extrapolate when predict is used on incoming data points.
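For the classifier case, a small sketch along the same lines, assuming LogisticRegression as the classifier (any scikit-learn classifier follows the same fit/predict pattern). The two well-separated groups of points are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Invented training data: small values belong to class 0, large values to class 1.
X_train = np.array([[0.0], [1.0], [2.0], [8.0], [9.0], [10.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression()
clf.fit(X_train, y_train)  # training on labeled data

# Classify new, unseen points.
X_new = np.array([[1.5], [8.5]])
labels = clf.predict(X_new)
print(labels)  # one point near each group, so classes 0 and 1
```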
It should also be noted that the "fit" nomenclature is used for steps that are not predictive models, such as scalers and other preprocessing steps. In this case, fit computes whatever the transformation needs from your data (for example, the per-feature minimum and maximum for a min-max scaler, or the vocabulary and IDF weights for TF-IDF), and transform (or fit_transform) then applies the transformation to the data.
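Here is a minimal sketch of fit on a preprocessing step, using MinMaxScaler with made-up numbers. Calling fit stores the minimum and maximum seen in the training data, and transform then rescales any data using those stored values.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Invented training data with a single feature ranging from 1 to 9.
X_train = np.array([[1.0], [5.0], [9.0]])

scaler = MinMaxScaler()
scaler.fit(X_train)  # records the per-feature min and max

learned_min = scaler.data_min_[0]
learned_max = scaler.data_max_[0]
print(learned_min, learned_max)  # 1.0 9.0

# transform() reuses the stored min and max, even for new data.
X_new = np.array([[5.0], [13.0]])
scaled = scaler.transform(X_new)
print(scaled.ravel())  # 5 maps to 0.5; 13 lies outside the training range and maps to 1.5
```

This is why a scaler is typically fitted on the training set only and then used to transform both the training and test sets.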
Note: here are a couple of references...

- What happens if the fit method is called twice with different data? – AleGallagher Dec 30 '22 at 13:27
- The result is just a different model, i.e. a different set of coefficients representing the new data supplied. Calling fit again on the same object overwrites the previous result, so to keep both you need to fit a separate estimator object under a different name. If you do keep them as separate objects, you can compare the two models. If they have the same number of dimensions and represent the same or similar populations, these comparisons can be very fruitful. – Kevin Glynn Mar 23 '23 at 19:15
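A small sketch of the behavior described in this comment, assuming LinearRegression and two made-up datasets. Refitting the same object replaces its coefficients, while two separate objects can be kept side by side and compared.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Two invented datasets sharing the same inputs but with different trends.
X = np.array([[0.0], [1.0], [2.0]])
y_first = 2.0 * X.ravel()    # slope of 2
y_second = -3.0 * X.ravel()  # slope of -3

model = LinearRegression()
model.fit(X, y_first)
slope_after_first_fit = model.coef_[0]

model.fit(X, y_second)  # second call replaces the earlier coefficients
slope_after_second_fit = model.coef_[0]
print(slope_after_first_fit, slope_after_second_fit)  # approximately 2.0 and -3.0

# To keep both models, fit two separate estimator objects.
model_a = LinearRegression().fit(X, y_first)
model_b = LinearRegression().fit(X, y_second)
print(model_a.coef_[0], model_b.coef_[0])  # approximately 2.0 and -3.0
```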