1

I would like to export a decision tree using sklearn.

First I trained a decision tree classifier:

self._selected_classifier = tree.DecisionTreeClassifier()
self._selected_classifier.fit(train_dataframe, train_class)

self._column_names = list(train_dataframe.columns.values)

After that I used the following method in order to export the decision tree:

def _create_graph_visualization(self):
    decision_tree_classifier = self._selected_classifier 

    from sklearn.externals.six import StringIO
    dot_data = StringIO()
    tree.export_graphviz(decision_tree_classifier,
                         out_file=dot_data,
                         feature_names=self._column_names)
    import pydotplus
    graph = pydotplus.graph_from_dot_data(dot_data.getvalue())
    graph.write_pdf("decision_tree_output.pdf")

After fixing many errors about missing executables, the program now finishes successfully. The file is created, but it is empty. What am I doing wrong?

tfv
Aviade
  • You might get help quicker if you included some data so that anyone can just run your code and see the error. – tfv Oct 03 '16 at 06:00

1 Answer

0

Here is an example with output which works for me, using pydotplus:

from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO

from sklearn import tree
import pydotplus

# Define training and target set for the classifier
train = [[1, 2, 3], [2, 5, 1], [2, 1, 7]]
target = [10, 20, 30]

# Initialize the classifier with a fixed random seed (0) so results are reproducible
dectree = tree.DecisionTreeClassifier(random_state=0)
dectree.fit(train, target)

# Test the classifier with an unseen feature vector
# (predict expects a 2D input: one row per sample)
test = [[2, 2, 3]]
predicted = dectree.predict(test)

# Export the tree as DOT text into an in-memory buffer, then render it
dotfile = StringIO()
tree.export_graphviz(dectree, out_file=dotfile)
graph = pydotplus.graph_from_dot_data(dotfile.getvalue())
graph.write_png("dtree.png")
graph.write_pdf("dtree.pdf")
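
If the rendered file still comes out empty, it may help to look at the DOT text itself before handing it to pydotplus. The following is only a sketch: it assumes scikit-learn 0.18 or later, where export_graphviz returns the DOT source as a string when out_file=None, and the feature names f0, f1, f2 are made up for illustration.

```python
from sklearn import tree

# Same toy data as in the example above
train = [[1, 2, 3], [2, 5, 1], [2, 1, 7]]
target = [10, 20, 30]

dectree = tree.DecisionTreeClassifier(random_state=0)
dectree.fit(train, target)

# With out_file=None the DOT source is returned as a string
# instead of being written to a file or buffer
dot_source = tree.export_graphviz(
    dectree,
    out_file=None,
    feature_names=["f0", "f1", "f2"],  # illustrative names, not from the question
)

# A usable export is non-empty and begins with the DOT keyword "digraph"
first_line = dot_source.splitlines()[0]
print(first_line)
print("number of DOT lines:", len(dot_source.splitlines()))
```

If the DOT text looks reasonable here, the same string can be passed to pydotplus.graph_from_dot_data, and any remaining problem is more likely in the Graphviz installation than in sklearn.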
tfv
  • 1
    If I use your way, I get an error: graph.write_pdf("dtree.pdf") Expected {'graph' | 'digraph'} (at char 0), (line:1, col:1) AttributeError: 'NoneType' object has no attribute 'write_pdf' – Jan Sila Oct 10 '16 at 22:25
  • 1
    any idea how to avoid it? – Jan Sila Oct 10 '16 at 22:26
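
The "Expected {'graph' | 'digraph'} (at char 0)" message suggests that the parser received text that does not start with a DOT header, for example an empty buffer, after which graph_from_dot_data returns None. One possible way to catch this earlier is to check the text before parsing it. The helper names looks_like_dot and require_dot below are hypothetical, not part of sklearn or pydotplus, and the sketch assumes Python 3.

```python
import re

# DOT sources start with an optional "strict", then "graph" or "digraph"
# (keywords are case-insensitive in the DOT language)
_DOT_HEADER = re.compile(r"\s*(strict\s+)?(di)?graph\b", re.IGNORECASE)


def looks_like_dot(text):
    """Return True if text is a string that begins with a DOT graph header."""
    return isinstance(text, str) and bool(_DOT_HEADER.match(text))


def require_dot(text):
    """Return text unchanged, or raise ValueError if it is not DOT-like."""
    if not looks_like_dot(text):
        preview = repr(text)[:60]
        raise ValueError("not DOT source, got: " + preview)
    return text


# Quick checks on typical inputs
print(looks_like_dot("digraph Tree {\nnode [shape=box] ;\n}"))
print(looks_like_dot(""))
```

In the answer's code this would be used as graph_from_dot_data(require_dot(dotfile.getvalue())), so an empty buffer fails with a clear message instead of an AttributeError on None.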