
I am running this in Databricks but the decision tree image will not display.

%pip install pydot
%pip install pydotplus

# Load libraries
from sklearn.tree import DecisionTreeClassifier
from sklearn import datasets
from IPython.display import Image  
from sklearn import tree
import pydotplus

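# Load data
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Train the classifier (clf is exported below)
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, y)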

# Create DOT data
dot_data = tree.export_graphviz(clf, out_file=None, 
                                feature_names=iris.feature_names,  
                                class_names=iris.target_names)

# Draw graph
graph = pydotplus.graph_from_dot_data(dot_data)  

# Show graph
Image(graph.create_png())

I only get this message (no visual): Out[4]: <IPython.core.display.Image object>

I'm stumped. Thoughts?

Alex Ott
user12814878

1 Answer


Databricks has the worst documentation, and their examples do not work at this time, so I had to come up with my own solution using PIL and Matplotlib.

Here is how I display images in Databricks in Python:

from PIL import Image
import matplotlib.pyplot as plt

def display_image(path, dpi=100):
    """
    Description:
        Displays an image
    Inputs:
        path (str): File path
        dpi (int): Your monitor's pixel density
    """
    img = Image.open(path)
    width, height = img.size
    plt.figure(figsize = (width/dpi,height/dpi))
    plt.imshow(img, interpolation='nearest', aspect='auto')
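
The same idea can be applied to the decision tree in the question without writing a file first, because `graph.create_png()` returns the PNG as bytes and PIL can open any file-like object. A minimal sketch, assuming the notebook renders Matplotlib figures inline; `display_png_bytes` is a hypothetical helper name, not part of any library:

```python
import io

import matplotlib.pyplot as plt
from PIL import Image


def display_png_bytes(png_bytes, dpi=100):
    """Render raw PNG bytes as a Matplotlib figure and return the figure.

    display_png_bytes is a hypothetical helper name used for illustration.
    """
    # PIL accepts file-like objects, so wrap the bytes in an in-memory buffer.
    img = Image.open(io.BytesIO(png_bytes))
    width, height = img.size
    # Size the figure in inches so it has roughly the image's pixel dimensions.
    fig = plt.figure(figsize=(width / dpi, height / dpi), dpi=dpi)
    plt.imshow(img, interpolation="nearest", aspect="auto")
    plt.axis("off")  # hide ticks, which are meaningless for a rendered tree
    return fig
```

With the question's code, this would be called as `fig = display_png_bytes(graph.create_png())` in place of `Image(graph.create_png())`.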
Esben Eickhardt
  • This function looks good but I'm wondering if there is any way we can use this for abfss storage in data lake? Any idea? – hkay Dec 13 '22 at 03:16
  • If you have your data lake mounted on Databricks, this same function will work. The function operates "in memory", so it doesn't care where the data comes from. – Esben Eickhardt Dec 13 '22 at 07:59
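
For the mounted case mentioned in the comment above, files under a mount are usually reachable through the local `/dbfs/...` path, which is what `Image.open` needs. A rough sketch, assuming the cluster exposes the `/dbfs` local mount (some access modes may not); `to_local_dbfs_path` is a hypothetical helper, and the mount name `lake` in the usage note is made up. It does not handle direct `abfss://` URIs.

```python
def to_local_dbfs_path(uri):
    """Map a 'dbfs:/...' URI to its '/dbfs/...' local-file equivalent.

    Hypothetical helper; assumes the cluster exposes the /dbfs local mount.
    """
    prefix = "dbfs:/"
    if uri.startswith(prefix):
        # Drop the scheme and any extra leading slashes, then re-root at /dbfs.
        return "/dbfs/" + uri[len(prefix):].lstrip("/")
    # Anything else is assumed to be a local path already; leave it unchanged.
    return uri
```

For example, `display_image(to_local_dbfs_path("dbfs:/mnt/lake/tree.png"))` would open `/dbfs/mnt/lake/tree.png`.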