
I have built a model using scikit-learn's AdaBoostClassifier with logistic regression as the base estimator.

model = AdaBoostClassifier(base_estimator=linear_model.LogisticRegression()).fit(X_train, Y_train)

How do I obtain the coefficients of the model? I want to see how much each feature will contribute numerically towards the target variable log(p/(1-p)).

Many thanks.

Leockl

1 Answer

AdaBoost has an estimators_ attribute that lets you iterate over all fitted base learners, and you can read the coef_ attribute of each base learner to get the coefficients assigned to each feature. You can then average the coefficients. Note that you'll have to take into account the fact that each of AdaBoost's base learners is assigned its own weight, available in estimator_weights_.

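import numpy as np

# Scale each base learner's coefficients by its AdaBoost weight
coefs = []
for clf, w in zip(model.estimators_, model.estimator_weights_):
    coefs.append(clf.coef_ * w)
# Average the weighted coefficients over the base learners
coefs = np.array(coefs).mean(axis=0)
print(coefs)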

If you've got binary classification, coef_ has shape (1, n_features), so you might want to flatten it by changing the line inside the loop to:

coefs.append(clf.coef_.reshape(-1)*w)
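
The loop above multiplies each coefficient vector by its weight and then divides by the number of learners, so the result is a weighted average scaled by a constant factor (sum of weights divided by number of learners). If a normalized weighted average is wanted, something along these lines may work. This is only a sketch: the synthetic data, n_estimators=10, and the names stacked, weights, weighted_avg, and unnormalized are assumptions made for illustration, not part of the original question. Also keep in mind that AdaBoost combines the learners' weighted votes rather than their log-odds, so the averaged coefficients are a rough summary and not exact coefficients of the ensemble.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression

# Synthetic binary data, purely for illustration (assumption).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# The base learner is passed positionally because its keyword differs
# between scikit-learn releases (base_estimator vs. estimator).
model = AdaBoostClassifier(
    LogisticRegression(), n_estimators=10, random_state=0
).fit(X, y)

# One row of flattened coefficients per fitted base learner.
stacked = np.array([clf.coef_.reshape(-1) for clf in model.estimators_])

# Keep only the weights of learners that were actually kept, in case
# boosting stopped before n_estimators rounds.
weights = model.estimator_weights_[: len(model.estimators_)]

# Normalized weighted average: sum(w_i * coef_i) / sum(w_i).
weighted_avg = np.average(stacked, axis=0, weights=weights)

# What the original loop computes: sum(w_i * coef_i) / n_learners.
unnormalized = (stacked * weights[:, None]).mean(axis=0)

print(weighted_avg)
print(unnormalized)
```

The two results point in the same direction and differ only by the factor weights.sum() / len(weights), so the relative importance of the features is the same either way. When every learner has weight 1, both reduce to the plain mean.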
Shihab Shahriar Khan
  • Many thanks! Question: Since AdaBoostClassifier is a boosting technique that successively boosts previous base learners, shouldn't the final model be the last model in the chain of base learners? In that case, shouldn't we take the `coef_` of the final base learner rather than the mean of the `coef_` of all base learners? – Leockl Feb 24 '20 at 01:02
  • I'm not sure what you mean by "successively boosts previous base learners", but at each step AdaBoost finds the set of instances that *all* current estimators *combined* fail to classify. The final classifier is then a weighted sum of all base learners, not just the last one. – Shihab Shahriar Khan Feb 24 '20 at 06:59
  • Thanks Shihab. Apologies, I had my AdaBoost algorithm mixed up, but I am clear now. Also, since there are weights involved, shouldn't the mean be calculated as a weighted mean, i.e. `coefs = np.average(coefs, axis=0, weights=w)`? – Leockl Feb 26 '20 at 03:26
  • Are you trying to replace this line: `coefs = np.array(coefs).mean(axis=0)`? If so, the weights are already applied in the previous line, `coefs.append(clf.coef_*w)`. – Shihab Shahriar Khan Feb 26 '20 at 08:48