
Before building the model I scale the features like this:

from sklearn.preprocessing import StandardScaler

X = StandardScaler(with_mean=False, with_std=True).fit_transform(X)

and afterwards build a feature importance plot:

import xgboost as xgb
import matplotlib.pyplot as plt

xgb.plot_importance(bst, color='red')
plt.title('importance', fontsize=20)
plt.yticks(fontsize=10)
plt.ylabel('features', fontsize=20)
plt.show()


The problem is that instead of the feature names the plot shows f0, f1, f2, f3, etc. How can I get the feature names back?

thanks

Edward

3 Answers


First, get the list of feature names before preprocessing:

# build the DMatrix from the original (still named) data;
# the column names are stored on it
dtrain = xgb.DMatrix(X, label=y)
dtrain.feature_names

Then map the default f0, f1, ... keys back to those names and plot:

# get_fscore() returns importances keyed by the default names f0, f1, ...
mapper = {'f{0}'.format(i): name for i, name in enumerate(dtrain.feature_names)}
mapped = {mapper[k]: v for k, v in bst.get_fscore().items()}
xgb.plot_importance(mapped, color='red')

that's all
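To make the mapping concrete, here is a minimal end-to-end sketch under placeholder assumptions: iris stands in for the real data, and the training parameters are arbitrary.

import xgboost as xgb
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

data = load_iris()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target

# a DMatrix built from the DataFrame remembers the column names
names = xgb.DMatrix(X, label=y).feature_names

# scaling returns a bare ndarray, so a booster trained on it
# only knows the features as f0, f1, ...
X_scaled = StandardScaler().fit_transform(X)
dtrain = xgb.DMatrix(X_scaled, label=y)
bst = xgb.train({'objective': 'multi:softprob', 'num_class': 3},
                dtrain, num_boost_round=10)

# remap the default keys to the real names and plot
mapper = {'f{0}'.format(i): name for i, name in enumerate(names)}
mapped = {mapper[k]: v for k, v in bst.get_fscore().items()}
xgb.plot_importance(mapped, color='red')
plt.show()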

Edward
  • I needed to use `bst.booster().get_score().items()` instead of `bst.get_fscore().items()` when `bst` is an instance of `XGBClassifier`. – corochann Oct 24 '17 at 06:18
  • I needed to use `bst.get_booster().get_score().items()`. – Masih Jul 14 '19 at 02:57
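Following up on those comments: if the model was trained through the scikit-learn wrapper, the same remap can be done against the underlying booster. A sketch, reusing X_scaled, y, and mapper from the example above; clf is a hypothetical XGBClassifier:

import xgboost as xgb
from xgboost import XGBClassifier

clf = XGBClassifier()                 # hypothetical sklearn-style model
clf.fit(X_scaled, y)                  # X_scaled, y as in the sketch above

# the wrapper exposes the low-level booster via get_booster()
score = clf.get_booster().get_score(importance_type='weight')
mapped = {mapper[k]: v for k, v in score.items()}
xgb.plot_importance(mapped, color='red')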

For xgboost 0.82 the fix is quite simple: just overwrite the trained model's feature_names attribute with the list of feature name strings.

# the list must contain one name per feature the model was trained on
trained_xgbmodel.feature_names = feature_name_list
xgboost.plot_importance(trained_xgbmodel)
SriK

You can retrieve the feature importances of an Xgboost model (trained with the scikit-learn-like API) with:

# here `xgb` is the fitted model object, not the xgboost module
xgb.feature_importances_

To check which type of importance it is, inspect xgb.importance_type. The importance type can be set in the Xgboost constructor. You can read about the ways to compute feature importance in Xgboost in this post.
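A short sketch of that API, assuming the iris toy dataset and a model variable named model (so it doesn't shadow the xgboost import):

from sklearn.datasets import load_iris
from xgboost import XGBClassifier

data = load_iris()

# importance_type is chosen in the constructor; 'gain' is one option
model = XGBClassifier(importance_type='gain')
model.fit(data.data, data.target)

print(model.importance_type)          # -> 'gain'
# pair each importance with its feature name
for name, score in zip(data.feature_names, model.feature_importances_):
    print(name, score)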

pplonski