
I'm using scikit-learn for some machine learning. I often use GridSearchCV or cross_val_score to explore hyperparameters and perform cross-validation. Both let me specify a scoring function, like this:

scores = -cross_val_score(svr, X, Y, cv=10, scoring='neg_mean_squared_error')

However, I want to train my SVR model using mean squared error. Unfortunately, there's no scoring parameter in either the constructor for SVR or the fit method.

How should I do this?

Thanks!

anon_swe
  • So you actually want a custom objective function or loss function, not scoring. See [this similar question](https://stackoverflow.com/questions/45698160/python-svm-function-with-huber-loss) and [this scikit issue](https://github.com/scikit-learn/scikit-learn/issues/1701) – Vivek Kumar Mar 10 '18 at 02:24

1 Answer


I typically use a Pipeline for this. You can create a pipeline whose steps include the SVR model (and others, if you want), then pass that pipeline to GridSearchCV as the estimator.

The search space goes in params_grid, with keys of the form stepname__paramname (double underscore in between). For example, if the pipeline step is named svr and I want to search over the parameter C, the key in the parameter dictionary is svr__C.

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_score, GridSearchCV
from sklearn.svm import SVR

# candidate values for the SVR regularization parameter C
c_range = np.arange(1, 10, 1)
pipeline = Pipeline([('svr', SVR())])
params_grid = {'svr__C': c_range}
# grid search with 3-fold cross validation
gridsearch_model = GridSearchCV(pipeline, params_grid,
                                cv=3, scoring='neg_mean_squared_error')

Then fit on the training data as usual and read off the best parameters and score:

gridsearch_model.fit(X_train, y_train)
print(gridsearch_model.best_params_, gridsearch_model.best_score_)
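For reference, here is a minimal end-to-end sketch of the steps above that runs on its own. The data is synthetic (generated with make_regression) and the sample sizes and noise level are arbitrary assumptions, not anything from the question; substitute your own X_train and y_train.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVR

# Synthetic regression data, purely illustrative (an assumption, not the asker's data)
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One-step pipeline; the step name 'svr' is the prefix used in the grid keys
pipeline = Pipeline([('svr', SVR())])
params_grid = {'svr__C': np.arange(1, 10, 1)}

# Candidates are ranked by negated MSE, so higher (closer to 0) is better
gridsearch_model = GridSearchCV(pipeline, params_grid,
                                cv=3, scoring='neg_mean_squared_error')
gridsearch_model.fit(X_train, y_train)

# Flip the sign to report an ordinary (non-negative) mean squared error
best_mse = -gridsearch_model.best_score_
print(gridsearch_model.best_params_, best_mse)
```

Since the scorer is the negated MSE, best_score_ is at most zero, which is why the sign is flipped before printing.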

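You can also pass the grid search object to cross_val_score, which runs the whole grid search inside an outer cross-validation loop and returns one score per outer fold: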

cross_val_score(gridsearch_model, X_train, y_train, 
                cv=3, scoring='neg_mean_squared_error')
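One caveat worth making concrete: the scoring argument only controls how fitted candidates are evaluated and ranked. It does not change the loss that SVR minimizes during fit, which is the epsilon-insensitive loss. The sketch below (synthetic data, arbitrary settings, all assumptions for illustration) fits the same single-candidate grid under two different scorers and checks that the refit models predict identically. It also fits KernelRidge, which does minimize squared error (with an L2 penalty), as one possible alternative if squared-error training is what you need.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.kernel_ridge import KernelRidge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

# Synthetic data, purely illustrative
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Same one-candidate grid, two different scoring metrics
grid = {'C': [1.0]}
by_mse = GridSearchCV(SVR(), grid, cv=3,
                      scoring='neg_mean_squared_error').fit(X, y)
by_mae = GridSearchCV(SVR(), grid, cv=3,
                      scoring='neg_mean_absolute_error').fit(X, y)

# The refit models agree: scoring only ranked candidates,
# it did not alter the training objective
same_model = np.allclose(by_mse.predict(X), by_mae.predict(X))

# KernelRidge is trained on squared error plus an L2 penalty
kr = KernelRidge(kernel='rbf', alpha=1.0).fit(X, y)
kr_mse = mean_squared_error(y, kr.predict(X))
print(same_model, kr_mse)
```

Whether KernelRidge is an acceptable substitute depends on the problem; it is shown here only to contrast a squared-error training objective with SVR's.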

Hope this helps!

titipata
  • Thanks! So when you call `gridsearch_model.fit(X_train, y_train)`, is that training the model with `neg_mean_squared_error`? – anon_swe Mar 10 '18 at 16:26
  • Yes, I believe so! – titipata Mar 10 '18 at 16:33
  • Great! So there's no general scoring function to pass into fit? I don't see it in any of the docs for `LinearRegression`, `svm.SVR`, etc. Seems crazy! Don't think I should have to do GridSearchCV just to say how I want the loss calculated... – anon_swe Mar 10 '18 at 16:34
  • Ah, I see. So you just want to calculate `neg_mean_squared_error`. I guess `cross_val_score` also has it implemented? Let me check quickly – titipata Mar 10 '18 at 16:40