interpretation of neg_mean_squared_error

Question

I have a small doubt regarding the neg_mean_squared_error scorer from sklearn's metrics. I am using a Ridge regression model with cross-validation:

cross_val_score(estimator, X_train, y_train, cv=5, scoring='neg_mean_squared_error')

I am using different values of alpha to choose the best model.

alphas= (0.01, 0.05, 0.1, 0.3, 0.8, 1, 5, 10, 15, 30, 50)

I calculate the mean of the 5 values returned by cross_val_score for each alpha and plot them in this figure (the mean score is on the y axis, alpha is on the x axis).

Doing some research, I see that with neg_mean_squared_error we need to look for 'the smaller the better'. Does that mean I have to look for the literally smallest value, which would be the first value in my graph, or the smallest in the sense of 'closest to 0'?

In my case all values are negative, which is why I am unsure about the interpretation.

Thank you very much.

I'd also suggest [this post](https://stackoverflow.com/questions/48244219/is-sklearn-metrics-mean-squared-error-the-larger-the-better-negated) if you want to expand further — amiola, Nov 13 '21 at 16:59

score 2 · Answer 1 · answered Nov 13 '21 at 16:53

From the scikit-learn docs:

All scorer objects follow the convention that higher return values are better than lower return values. Thus metrics which measure the distance between the model and the data, like metrics.mean_squared_error, are available as neg_mean_squared_error which return the negated value of the metric.

So what you want is the maximum of your values, i.e. closest to 0.
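As a minimal sketch of that selection rule, the snippet below reuses the alphas from the question and picks the one with the highest mean score. The data is synthetic (`make_regression` stands in for the asker's `X_train` and `y_train`, which are not shown), so the chosen alpha here says nothing about the real dataset.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Assumption: synthetic data in place of the asker's X_train / y_train.
X_train, y_train = make_regression(
    n_samples=200, n_features=10, noise=10.0, random_state=0
)

alphas = (0.01, 0.05, 0.1, 0.3, 0.8, 1, 5, 10, 15, 30, 50)

# Mean of the 5 fold scores for each alpha; every value is <= 0.
mean_scores = np.array([
    cross_val_score(
        Ridge(alpha=a), X_train, y_train, cv=5,
        scoring='neg_mean_squared_error',
    ).mean()
    for a in alphas
])

# Higher is better, so take the maximum (the value closest to 0).
best_idx = int(np.argmax(mean_scores))
best_alpha = alphas[best_idx]

# Flip the sign to recover the ordinary (positive) MSE.
best_mse = -mean_scores[best_idx]

print(f"best alpha: {best_alpha}, mean CV MSE: {best_mse:.3f}")
```

Taking `np.argmax` of the negated scores is the same as taking `np.argmin` of the positive MSE values, so either view gives the same alpha.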

score 1 · Accepted Answer · answered Nov 13 '21 at 17:03

By convention, scikit-learn treats a score as something where higher values are better than lower values. MSE follows the opposite rule: a small MSE means your predictions are close to the data. That is why sklearn uses the negated MSE as the score. A larger neg_mean_squared_error (closer to 0) is therefore better than a smaller one. This is also consistent with your graph, because extreme parameter values generally degrade a model.
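The sign convention can be checked directly. This is a small sketch on synthetic data (the dataset and `alpha=1.0` are illustrative assumptions, not taken from the question): it compares the value returned by the `neg_mean_squared_error` scorer with `mean_squared_error` computed on the same predictions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import get_scorer, mean_squared_error

# Assumption: synthetic data and an arbitrary alpha, for illustration only.
X, y = make_regression(n_samples=100, n_features=5, noise=5.0, random_state=0)
model = Ridge(alpha=1.0).fit(X, y)

# The scorer takes (estimator, X, y) and returns the negated MSE.
scorer = get_scorer('neg_mean_squared_error')
score = scorer(model, X, y)

# The plain metric takes (y_true, y_pred) and returns a non-negative MSE.
mse = mean_squared_error(y, model.predict(X))

print(score, -mse)
```

The two printed numbers should match up to floating-point error, which is why a score closer to 0 corresponds to a smaller error.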

[Screenshot from the scikit-learn website stating this convention]
