9

I've trained a gradient boosting classifier, and I would like to visualize it using the export_graphviz tool shown here.

When I try it I get:

AttributeError: 'GradientBoostingClassifier' object has no attribute 'tree_'

This is because export_graphviz is meant for single decision trees, but I guess there is still a way to visualize the model, since a gradient boosting classifier must be built on underlying decision trees.

How can I do that?

desertnaut
Carlos Pinzón
  • Have you tried XGBoost? [link](http://machinelearningmastery.com/visualize-gradient-boosting-decision-trees-xgboost-python/) – seralouk Jul 07 '17 at 15:38
  • Thanks for introducing me to the XGBoost library. I'll check it out, although I found out how to do it using sklearn. – Carlos Pinzón Jul 07 '17 at 16:08

2 Answers

21

The estimators_ attribute contains the underlying decision trees. The following code displays one of the trees of a trained GradientBoostingClassifier. Notice that although the ensemble as a whole is a classifier, each individual tree is a regression tree that outputs floating-point values.

from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import export_graphviz
import numpy as np

# Fictitious data
np.random.seed(0)
X = np.random.normal(0,1,(1000, 3))
y = X[:,0]+X[:,1]*X[:,2] > 0

# Classifier
clf = GradientBoostingClassifier(max_depth=3, random_state=0)
clf.fit(X[:600], y[:600])

# Get the tree number 42
sub_tree_42 = clf.estimators_[42, 0]

# Visualization
# Install graphviz: https://www.graphviz.org/download/
from pydotplus import graph_from_dot_data
from IPython.display import Image
dot_data = export_graphviz(
    sub_tree_42,
    out_file=None, filled=True, rounded=True,
    special_characters=True,
    proportion=False, impurity=False, # enable them if you want
)
graph = graph_from_dot_data(dot_data)
png = graph.create_png()
# Save (optional)
from pathlib import Path
Path('./out.png').write_bytes(png)
# Display
Image(png)

Tree number 42:

Code output (decision tree image)
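To make the point about floating-point outputs concrete, here is a minimal sketch that inspects estimators_ directly. It reuses the same fictitious data and settings as above. The variable names tree_sum and offset are illustrative only, and the final check assumes the default init estimator, which contributes a constant starting score.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Same fictitious data as in the answer above
np.random.seed(0)
X = np.random.normal(0, 1, (1000, 3))
y = X[:, 0] + X[:, 1] * X[:, 2] > 0

clf = GradientBoostingClassifier(max_depth=3, random_state=0)
clf.fit(X[:600], y[:600])

# One row per boosting stage, one column per tree fitted in that stage.
# For this binary problem with the default n_estimators=100: (100, 1)
print(clf.estimators_.shape)

# Each entry is a regression tree, not a classification tree
tree = clf.estimators_[42, 0]
print(type(tree).__name__)   # DecisionTreeRegressor
print(tree.predict(X[:5]))   # real-valued scores, not class labels

# The ensemble's raw score should be a constant initial value plus
# learning_rate times the sum of the individual tree outputs.
X_test = X[600:]
tree_sum = clf.learning_rate * sum(t.predict(X_test) for t in clf.estimators_[:, 0])
offset = clf.decision_function(X_test) - tree_sum
print(np.allclose(offset, offset[0]))
```

This is why a single exported tree shows numeric leaf values rather than class counts: the class prediction only emerges after the scores of all trees are added up.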

Carlos Pinzón
1

To add to the existing answer, there is another nice visualization package called dtreeviz which I find really useful.

Borrowing code from the existing answer:

from sklearn.ensemble import GradientBoostingClassifier
import numpy as np
from dtreeviz.trees import dtreeviz  # explicit import; this is the dtreeviz 1.x API

# Fictitious data
np.random.seed(0)
X = np.random.normal(0,1,(1000, 3))
y = X[:,0]+X[:,1]*X[:,2] > 0

# Classifier
clf = GradientBoostingClassifier(max_depth=3, random_state=0)
clf.fit(X[:600], y[:600])

# Get the tree number 42
sub_tree_42 = clf.estimators_[42, 0]

# Visualization
viz = dtreeviz(sub_tree_42,
               x_data=X,
               y_data=y,
               target_name='Positive',
               feature_names=['X0', 'X1', 'X2'],
               class_names=['Negative', 'Positive'],
               title='Tree 42 visualization')

viz.save("tree_visualization.svg") 
viz.view()
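Both export_graphviz with pydotplus and dtreeviz rely on a Graphviz installation. If that is not available, one possible alternative is sklearn.tree.plot_tree, which draws with matplotlib only. The following is a sketch under that assumption, not part of either answer's original approach; the output file name tree_42.png and the feature names are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import plot_tree

# Same fictitious data as above
np.random.seed(0)
X = np.random.normal(0, 1, (1000, 3))
y = X[:, 0] + X[:, 1] * X[:, 2] > 0

clf = GradientBoostingClassifier(max_depth=3, random_state=0)
clf.fit(X[:600], y[:600])

# Draw tree number 42 on a matplotlib axis
fig, ax = plt.subplots(figsize=(14, 6))
annotations = plot_tree(
    clf.estimators_[42, 0],
    feature_names=["X0", "X1", "X2"],
    filled=True,
    rounded=True,
    impurity=False,
    ax=ax,
)

# Save to disk (the file name is an arbitrary choice)
fig.savefig("tree_42.png", dpi=150, bbox_inches="tight")
plt.close(fig)
```

plot_tree is available in scikit-learn 0.21 and later. The layout is less polished than Graphviz output, but it needs no external binaries.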


Dudelstein