
This document shows that a model trained with the XGBoost API can be sliced with the following code:

from sklearn.datasets import make_classification
import xgboost as xgb

X, y = make_classification(n_samples=1000, n_informative=5, n_classes=3)
dtrain = xgb.DMatrix(data=X, label=y)
num_boost_round = 10

booster = xgb.train({
    'num_parallel_tree': 4, 'subsample': 0.5, 'num_class': 3},
    num_boost_round=num_boost_round, dtrain=dtrain)
sliced: xgb.Booster = booster[3:7]

I tried it and it worked.

Since XGBoost provides a Scikit-Learn wrapper interface, I tried something like this:

from xgboost import XGBClassifier

clf_xgb = XGBClassifier().fit(X_train, y_train)
clf_xgb_sliced: clf_xgb.Booster = booster[3:7]

But got the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-18-84155815d877> in <module>
----> 1 clf_xgb_sliced: clf_xgb.Booster = booster[3:7]

AttributeError: 'XGBClassifier' object has no attribute 'Booster'

Since XGBClassifier has no attribute 'Booster', is there any way to slice an XGBClassifier (or XGBRegressor) model trained through the Scikit-Learn wrapper interface?

desertnaut
SC Chen
1 Answer


The problem is the type hint you are giving, `clf_xgb.Booster`, which does not refer to an existing attribute. Try:

clf_xgb_sliced: xgb.Booster = clf_xgb.get_booster()[3:7]

instead.

Learning is a mess
  • Thank you. I think I forgot the import statements in the first code segment; I have edited my post to include them. The first code segment shows that slicing works for an XGBoost model trained with the XGBoost API. The second code segment shows that I want to do the same for an XGBoost model trained with the Scikit-Learn wrapper interface, but it failed. – SC Chen May 06 '22 at 11:43
  • In the code you share, all I see is the AttributeError I helped you fix. Can you share the code for the sklearn case? – Learning is a mess May 06 '22 at 12:16
  • Thanks for your question. In the second code segment, `XGBClassifier().fit(...` is used, and that is the sklearn (wrapper interface) case. The API doc of the sklearn wrapper interface of XGBoost is [here](https://xgboost.readthedocs.io/en/stable/python/python_api.html#module-xgboost.sklearn) for reference. I more often train XGBoost models through the sklearn wrapper interface, but I cannot slice this kind of model the way the [document](https://xgboost.readthedocs.io/en/stable/python/model.html) at the beginning of my post does. I hope there is a way to do this. – SC Chen May 06 '22 at 13:21
  • Okay I finally understood your question (I think so at least) and have updated my answer accordingly, please take a look. – Learning is a mess May 06 '22 at 13:42
  • Thank you so much for this idea. It works this way, but the result is an xgboost.Booster model, no longer an xgboost.XGBClassifier model. If I want to predict, I have to use [xgboost.Booster.predict](https://xgboost.readthedocs.io/en/stable/python/python_api.html#xgboost.Booster.predict), but I am used to predicting with [xgboost.XGBClassifier.predict()](https://xgboost.readthedocs.io/en/stable/python/python_api.html#xgboost.XGBClassifier.predict). Do you know how to convert the sliced xgboost.Booster back to an xgboost.XGBClassifier? – SC Chen May 06 '22 at 14:45
  • Yeah, I know booster != classifier, though if you are only interested in predicting, both will give you the same predictions. Converting is not pretty but reliable [this way](https://stackoverflow.com/questions/57681700/xgboost-get-classifier-object-form-booster-object). – Learning is a mess May 06 '22 at 14:48
  • Thank you! It works! Following your link, I tried `new_clf_xgb_replace_booster = XGBClassifier()` then `new_clf_xgb_replace_booster._Booster = clf_xgb_sliced`, which converts `clf_xgb_sliced` (an `xgboost.Booster` model) back into `new_clf_xgb_replace_booster` (an `xgboost.XGBClassifier` model). An interesting thing is that clf_xgb_sliced.predict() outputs results like array([[0.26339364, 0.21337277, 0.5232336 ], [0.16507962, 0.5867635 , 0.24815688], ..., [0.20555478, 0.22729845, 0.5671467 ]], dtype=float32) while new_clf_xgb_replace_booster.predict() outputs array([2, 1, ..., 2], dtype=int64) – SC Chen May 06 '22 at 18:39