
I want to draw a process diagram as a heatmap that shows how long each step took. I created the process diagram with pydot and a dataframe with timestamps for the individual steps.

How can I create a heatmap for my process?

The calculation should also include the step-to-step time. The edge times can be calculated as e.g. task1_start - start or task2_start - task1_end, and the node times as e.g. task1_end - task1_start or task2_end - task2_start.

My MVP only changes the color of the border, but I want to create a real heatmap.


Process

import pydot
from IPython.display import SVG

graph = pydot.Dot(graph_type='digraph')

task_node1 = pydot.Node("Task1", shape="box")
task_node2 = pydot.Node("Task2", shape="box")

graph.add_node(task_node1)
graph.add_node(task_node2)

task1_to_task2_edge = pydot.Edge("Task1", "Task2")
graph.add_edge(task1_to_task2_edge)

graph.write_svg("diagram.svg")
SVG('diagram.svg')


Dataframe


   id         step   timestamp
0   1  task1_start  2023-01-01
1   1    task1_End  2023-01-05
2   1  task2_start  2023-01-10
3   1    task2_end  2023-01-12
4   2  task1_start  2023-01-01
5   2    task1_End  2023-01-05
6   2  task2_start  2023-01-10
7   2    task2_end  2023-01-12
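
As a quick check of the arithmetic described earlier, here is a minimal sketch that computes the node and edge times for one id from the table above, using only the standard library. The dictionary names (events, node_days, edge_days) are made up for this illustration, and the sample data has no separate "start" row, so only the task1-to-task2 edge is computed.

```python
from datetime import datetime

# Sample timestamps for one id, copied from the table above.
events = {
    "task1_start": "2023-01-01",
    "task1_End": "2023-01-05",
    "task2_start": "2023-01-10",
    "task2_end": "2023-01-12",
}
# Lower-case the step names so 'task1_End' and 'task2_end' are spelled alike.
stamps = {step.lower(): datetime.fromisoformat(ts) for step, ts in events.items()}

# Node time: end minus start of the same task.
node_days = {
    "task1": (stamps["task1_end"] - stamps["task1_start"]).days,
    "task2": (stamps["task2_end"] - stamps["task2_start"]).days,
}

# Edge time: start of the next task minus end of the previous one.
edge_days = {
    "task1_to_task2": (stamps["task2_start"] - stamps["task1_end"]).days,
}

print(node_days)  # {'task1': 4, 'task2': 2}
print(edge_days)  # {'task1_to_task2': 5}
```

These are the three values that the heatmap colours should be derived from for each id.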

MVP

import pandas as pd 
d = {'id': [1, 1, 1, 1,
            2, 2, 2, 2,],
    'step': ['task1_start', 'task1_End', 'task2_start', 'task2_end',
              'task1_start', 'task1_End', 'task2_start', 'task2_end',],
     'timestamp': ['2023-01-01', '2023-01-05', '2023-01-10', '2023-01-12',
               '2023-01-01', '2023-01-05', '2023-01-10', '2023-01-12',]}

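df = pd.DataFrame(data=d)

df['timestamp'] = pd.to_datetime(df['timestamp'])

g = df.groupby('id')

# duration: time since the previous row of the same id
# step: 'task1' when both rows belong to the same task (node),
#       'task1_to_task2' when the rows belong to different tasks (edge)
out = (df
    .assign(duration=df['timestamp'].sub(g['timestamp'].shift()),
            step=lambda d: (df['step']+'/'+g['step'].shift()).str.replace(
                 r'([^_]+)[^/]*/([^_]+)[^/]*',
                 lambda m: m.group(1) if m.group(1)==m.group(2) else f"{m.group(2)}_to_{m.group(1)}",
                 regex=True)
           )
   [['id', 'step', 'duration']].dropna(subset=['duration'])  # first row per id has no predecessor
)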

df = out

import matplotlib.colors as mcolors

# Colour scale for the durations, from light blue (shortest) to magenta (longest)
colors = mcolors.LinearSegmentedColormap.from_list(
    'LightBlueGreenYellowRed', ['#B0E0E6', '#87CEEB', '#00FF00', '#ADFF2F', '#FFFF00', '#FFD700', '#FFA500', '#FF4500', '#FF0000', '#FF6347', '#FF7F50', '#FFA07A', '#FFC0CB', '#FFB6C1', '#FF69B4', '#DB7093', '#FF1493', '#C71585', '#FF00FF']
)

def duration_to_hex(value, vmin, vmax):
    # Scale the duration to 0..1 and look it up in the colormap
    norm = (value - vmin) / (vmax - vmin)
    return mcolors.to_hex(colors(norm))

vmin = df['duration'].min()
vmax = df['duration'].max()
df['color'] = df['duration'].apply(lambda x: duration_to_hex(x, vmin, vmax))

def get_color(step):
    # Colour of the first row for this step; grey if the step is unknown
    if (df['step'] == step).any():
        color = df.loc[df['step'] == step, 'color'].values[0]
        if pd.isnull(color):
            return '#808080'
        return color
    return '#808080'

import pydot
from IPython.display import SVG

graph = pydot.Dot(graph_type='digraph')

task_node1 = pydot.Node("Task1", shape="box", color = get_color('task1'))
task_node2 = pydot.Node("Task2", shape="box", color = get_color('task2'))



graph.add_node(task_node1)
graph.add_node(task_node2)


task1_to_task2_edge = pydot.Edge("Task1", "Task2", color = get_color('task1_to_task2'))


graph.add_edge(task1_to_task2_edge)


graph.write_svg("diagram.svg")
SVG('diagram.svg')


Test

2 Answers


For drawing the heatmap, use the SVG export and add class names to the nodes to mark how hot they are. You can then include that SVG group twice and use a filter to get something like your heatmap: a background copy that is filled with colours and blurred, and the normal black and white version as the foreground.

<svg height="110" width="110" xmlns="http://www.w3.org/2000/svg">

    <style>
        .foreground *{
        fill: none;
        stroke: black;
        }
        .foreground text{
        fill: black;
        stroke: none;
        }
        .background *{
        stroke: none;
        }

        .background text { display: none; }
        .background .heat-1 { fill: #00ff00; }
        .background *.heat-2 { fill: #ff0000; }
    </style>


    <defs>
        <filter id="f1" x="0" y="0">
            <feGaussianBlur in="SourceGraphic" stdDeviation="15" />
        </filter>

    </defs>
    <g transform="scale(1 1) rotate(0) translate(4 112)">
        <g class='background' filter='url(#f1)'>

            <g id="graph0" class="graph">
                <!-- a -->
                <g id="node1" class="node heat-1">
                    <title>a</title>
                    <ellipse cx="27" cy="-90" rx="27" ry="18" />
                    <text text-anchor="middle" x="27" y="-86.3" font-family="Times,serif"
                        font-size="14.00">a</text>
                </g>
                <!-- b -->
                <g id="node2" class="node heat-2">
                    <title>b</title>
                    <ellipse cx="27" cy="-18" rx="27" ry="18" />
                    <text text-anchor="middle" x="27" y="-14.3" font-family="Times,serif"
                        font-size="14.00">b</text>
                </g>
                <!-- a&#45;&gt;b -->
                <g id="edge1" class="edge">
                    <title>a&#45;&gt;b</title>
                    <path d="M27,-71.7C27,-63.98 27,-54.71 27,-46.11" />
                    <polygon fill="black" stroke="black"
                        points="30.5,-46.1 27,-36.1 23.5,-46.1 30.5,-46.1" />
                </g>
            </g>
        </g>
        <g class='foreground'>

            <g id="graph0" class="graph" >
                <!-- a -->
                <g id="node1" class="node heat-1">
                    <title>a</title>
                    <ellipse cx="27" cy="-90" rx="27" ry="18" />
                    <text text-anchor="middle" x="27" y="-86.3" font-family="Times,serif"
                        font-size="14.00">a</text>
                </g>
                <!-- b -->
                <g id="node2" class="node heat-2">
                    <title>b</title>
                    <ellipse cx="27" cy="-18" rx="27" ry="18" />
                    <text text-anchor="middle" x="27" y="-14.3" font-family="Times,serif"
                        font-size="14.00">b</text>
                </g>
                <!-- a&#45;&gt;b -->
                <g id="edge1" class="edge">
                    <title>a&#45;&gt;b</title>
                    <path d="M27,-71.7C27,-63.98 27,-54.71 27,-46.11" />
                    <polygon fill="black" stroke="black"
                        points="30.5,-46.1 27,-36.1 23.5,-46.1 30.5,-46.1" />
                </g>
            </g>
        </g>
    </g>
</svg>

It would be nicer to reference the graph group twice (with a <use> element) rather than including it twice, but I couldn't get a CSS expression to treat the group differently when it was referenced rather than inlined.
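
Duplicating the group by hand gets tedious for larger graphs, so the same layering could be scripted. The following is only a sketch under some assumptions: the function name add_heat_layers, the filter id heat-blur and the heat_colors mapping are made up for this example, and the heat-N classes are assumed to be present on the node groups already (for instance through Graphviz's class attribute). It uses xml.etree.ElementTree from the standard library and strips the id attributes from the background copy so that ids stay unique.

```python
import copy
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # serialise without "ns0:" prefixes


def svg_tag(name):
    return f"{{{SVG_NS}}}{name}"


def add_heat_layers(svg_text, heat_colors, blur=15):
    """Return SVG text with a blurred, coloured copy behind the outline copy."""
    root = ET.fromstring(svg_text)
    graph = root.find(svg_tag("g"))  # Graphviz wraps the drawing in one top-level <g>

    # Same rules as the hand-written <style> above, one fill rule per heat class.
    css = [
        ".foreground * { fill: none; stroke: black; }",
        ".foreground text { fill: black; stroke: none; }",
        ".background * { stroke: none; }",
        ".background text { display: none; }",
    ]
    for class_name, colour in heat_colors.items():
        # "*" targets the shapes themselves, because raw Graphviz output
        # carries a fill attribute on every shape.
        css.append(f".background .{class_name} * {{ fill: {colour}; }}")
    style = ET.Element(svg_tag("style"))
    style.text = "\n".join(css)

    defs = ET.Element(svg_tag("defs"))
    blur_filter = ET.SubElement(defs, svg_tag("filter"), {"id": "heat-blur"})
    ET.SubElement(blur_filter, svg_tag("feGaussianBlur"),
                  {"in": "SourceGraphic", "stdDeviation": str(blur)})
    root.insert(0, style)
    root.insert(1, defs)

    # Only node and edge groups are layered; the page polygon stays where it is.
    items = [child for child in graph if child.tag == svg_tag("g")]
    for item in items:
        graph.remove(item)

    background = ET.SubElement(graph, svg_tag("g"),
                               {"class": "background", "filter": "url(#heat-blur)"})
    foreground = ET.SubElement(graph, svg_tag("g"), {"class": "foreground"})
    for item in items:
        clone = copy.deepcopy(item)
        for element in clone.iter():
            element.attrib.pop("id", None)  # avoid duplicate ids in the document
        background.append(clone)
        foreground.append(item)

    return ET.tostring(root, encoding="unicode")


# A cut-down Graphviz-style export with two nodes, used as input.
example_svg = """<svg xmlns="http://www.w3.org/2000/svg" width="62" height="116">
<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 112)">
<polygon fill="white" stroke="transparent" points="-4,4 -4,-112 58,-112 58,4 -4,4"/>
<g id="node1" class="node heat-1"><title>a</title>
<ellipse fill="none" stroke="black" cx="27" cy="-90" rx="27" ry="18"/>
<text text-anchor="middle" x="27" y="-86.3">a</text></g>
<g id="node2" class="node heat-2"><title>b</title>
<ellipse fill="none" stroke="black" cx="27" cy="-18" rx="27" ry="18"/>
<text text-anchor="middle" x="27" y="-14.3">b</text></g>
</g>
</svg>"""

layered_svg = add_heat_layers(example_svg, {"heat-1": "#00ff00", "heat-2": "#ff0000"})
```

The returned string can be written to a file and opened in a browser. The blur radius and the colours are parameters, so they can be tuned without touching the SVG by hand.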

Pete Kirkham

All you need to do is set two attributes: fillcolor and style='filled'.

task_node1 = pydot.Node("Task1", shape="box", fillcolor = get_color('task1'), style='filled')
task_node2 = pydot.Node("Task2", shape="box", fillcolor = get_color('task2'), style='filled')

You can look at this small example over there

This is the result:

[image: the diagram with both task boxes filled]

Hamzah