How can we interpret feature importances for Stochastic Gradient Descent Classifier?

Question

I have an SGDClassifier model trained with scikit-learn. I extract the feature names with .get_feature_names() and the coefficients with .coef_.

I combine the two columns in a dataframe like this:

feature     value
hiroshima   3.918584
wildfire    3.287680
earthquake  3.256817
massacre    3.186762
storm       3.124809
...         ...
job         -1.696438
song        -1.736640   
as          -1.956571   
nowplaying  -2.028240   
write       -2.263968

I want to know how to interpret these feature importances. What does a high positive value mean? What does a low negative value mean?

I’m voting to close this question because it is not about programming as defined in the [help] but about ML theory and/or methodology. — desertnaut, Mar 11 '21 at 00:56

score 1 · Answer 1 · answered Mar 11 '21 at 01:32

SGDClassifier fits a linear model, meaning that the decision is essentially based on

SUM_i w_i f_i + b

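where w_i is the weight attached to feature f_i and b is the intercept. You can therefore read these numbers literally as "votes" for the positive or negative class, each with a strength proportional to its absolute value: a large positive weight pushes the prediction toward the positive class, and a large negative weight pushes it toward the negative class. All your classifier does is add up these weighted feature values, add the intercept_ value from your model, and classify based on the sign of the result.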
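
To make the "votes" reading concrete, here is a minimal pure-Python sketch of that sum. The weights are copied from the table in the question; the intercept value, the two example documents, and the helper names decision_score and predict are made up for illustration, and the feature values are assumed to be plain word counts (with TF-IDF features the same sum applies, just with different f_i).

```python
# Weights copied from the table in the question.
weights = {
    "hiroshima": 3.918584,
    "wildfire": 3.287680,
    "earthquake": 3.256817,
    "song": -1.736640,
    "nowplaying": -2.028240,
}
# Hypothetical intercept; a real one would be read from the fitted model's intercept_.
intercept = -0.5


def decision_score(feature_values, weights, intercept):
    """Return sum_i w_i * f_i + b; features without a known weight contribute 0."""
    return sum(weights.get(name, 0.0) * value
               for name, value in feature_values.items()) + intercept


def predict(feature_values, weights, intercept):
    """Classify by the sign of the score: positive -> class 1, otherwise class 0."""
    return 1 if decision_score(feature_values, weights, intercept) > 0 else 0


# Hypothetical documents, represented as word counts.
doc_a = {"wildfire": 1, "earthquake": 1}
doc_b = {"song": 1, "nowplaying": 1, "wildfire": 1}

for doc in (doc_a, doc_b):
    score = decision_score(doc, weights, intercept)
    print(doc, round(score, 4), predict(doc, weights, intercept))
```

In doc_b the single positive vote from "wildfire" is outweighed by the two negative votes plus the intercept, so the score ends up below zero. For a fitted binary model, this score should correspond to what decision_function returns.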
