
I am running a regression with XGBoost as follows:

clf = XGBRegressor(eval_set = [(X_train, y_train), (X_val, y_val)],
                       early_stopping_rounds = 10, 
                       n_estimators = 10,                    
                       verbose = 50)

clf.fit(X_train, y_train, verbose=False)
print("Best Iteration: {}".format(clf.booster().best_iteration))

The model trains correctly, but the print call raises the following error:

TypeError: 'str' object is not callable

How can I get the number of the best iteration of the model?

Furthermore, how can I print the training error of each round?

Ray
Alessandro Ceccarelli

2 Answers


For your TypeError: use get_booster() instead of booster()

print("Best Iteration: {}".format(clf.get_booster().best_iteration))

To use the best iteration when you predict, pass the ntree_limit parameter, which specifies the number of boosters to use. The value produced by training is best_ntree_limit, which you can read after fitting the model with clf.get_booster().best_ntree_limit. More specifically, when you predict, use:

best_ntree_limit = clf.get_booster().best_ntree_limit
clf.predict(data, ntree_limit=best_ntree_limit)

You can print the training and evaluation progress if you pass these parameters to .fit():

clf.fit(X_train, y_train,
        eval_set = [(X_train, y_train), (X_val, y_val)],
        eval_metric = 'rmse',
        early_stopping_rounds = 10, verbose=True)

NOTE: the early_stopping_rounds parameter should be passed to .fit(), not to the XGBRegressor() constructor.

Another NOTE: verbose = 50 in XGBRegressor() is redundant. The verbose argument belongs in your .fit() call and is True or False. For what verbose=True does, see the verbose section of the documentation. It directly affects your 3rd question.

Eran Moshe

Your error is that the booster attribute of XGBRegressor is a string that specifies the kind of booster to use, not the actual booster instance. From the docs:

booster: string
Specify which booster to use: gbtree, gblinear or dart.

In order to get the actual booster, you can call get_booster() instead:

>>> clf.booster
'gbtree'
>>> clf.get_booster()
<xgboost.core.Booster object at 0x118c40cf8>
>>> clf.get_booster().best_iteration
9
>>> print("Best Iteration: {}".format(clf.get_booster().best_iteration))
Best Iteration: 9

I'm not sure about the second half of your question, namely:

Furthermore, how can I print the training error of each round?

but hopefully you're unblocked!

Samuel Dion-Girardeau