I'm pretty sure my code is fine, but I can't generate a plot of a simple Sankey chart. Maybe something is off with the code; I'm not sure. Here's what I have now. Can anyone see a problem with this?

import pandas as pd
import holoviews as hv
import plotly.graph_objects as go
import plotly.express as pex
hv.extension('bokeh')

data = [['TMD','TMD Create','Sub-Section 1',17],['TMD','TMD Create','Sub-Section 1',17],['C4C','Customer Tab','Sub-Section 1',10],['C4C','Customer Tab','Sub-Section 1',10],['C4C','Customer Tab','Sub-Section 1',17]]
df = pd.DataFrame(data, columns=['Source','Target','Attribute','Value'])
df

source = df["Source"].values.tolist()
target = df["Target"].values.tolist()
value = df["Value"].values.tolist()
labels = df["Attribute"].values.tolist()

import plotly.graph_objs as go

#create links
link = dict(source=source, target=target, value=value, 
color=["turquoise","tomato"] * len(source))

#create nodes
node = dict(label=labels, pad=15, thickness=5)

#create a sankey object
chart = go.Sankey(link=link, node=node, arrangement="snap")

#build a figure
fig = go.Figure(chart)
fig.show()

I am trying to follow the basic example shown in the link below.

https://python.plainenglish.io/create-a-sankey-diagram-in-python-e09e23cb1a75

bigreddot
ASH

1 Answer

You mention two different packages, and each needs a different solution. I don't know which you prefer, so I'll explain both.

Data

import pandas as pd
df = pd.DataFrame({
    'Source':['a','a','b','b'],
    'Target':['c','d','c','d'],
    'Value': [1,2,3,4]
})
>>> df
  Source Target  Value
0      a      c      1
1      a      d      2
2      b      c      3
3      b      d      4

This is a very basic DataFrame with only 4 transitions.

Holoviews/Bokeh

With holoviews it is very easy to plot a Sankey diagram, because it takes the DataFrame as it is and derives the node labels from the values in the Source and Target columns.

import holoviews as hv
hv.extension('bokeh')

sankey = hv.Sankey(df)
sankey.opts(width=600, height=400)

This is created with holoviews 1.15.4 and bokeh 2.4.3.

[Figure: Sankey diagram created with holoviews]

Plotly

For plotly it is not so easy, because plotly wants integer node indices instead of labels in the source and target arrays. Therefore we have to manipulate the DataFrame before we can create the figure.

Here I collect all distinct labels and replace each one with a unique number.

# sort so that the numbering is deterministic: a=0, b=1, c=2, d=3
unique_labels = sorted(set(df['Source']) | set(df['Target']))
mapper = {v: i for i, v in enumerate(unique_labels)}
df['Source'] = df['Source'].map(mapper)
df['Target'] = df['Target'].map(mapper)
>>> df
   Source  Target  Value
0       0       2      1
1       0       3      2
2       1       2      3
3       1       3      4
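
If you would rather not depend on sorting, the same mapping can be sketched in plain Python by numbering the labels in order of first appearance. This is only a minimal sketch: `encode_labels` is a hypothetical helper name, not part of pandas or plotly, and the input lists are assumed to be the Source and Target columns converted with `.tolist()`.

```python
def encode_labels(sources, targets):
    """Map string labels to integer indices in order of first appearance."""
    # dict.fromkeys removes duplicates and keeps insertion order
    labels = list(dict.fromkeys(list(sources) + list(targets)))
    index = {label: i for i, label in enumerate(labels)}
    source_idx = [index[s] for s in sources]
    target_idx = [index[t] for t in targets]
    return labels, source_idx, target_idx


# the same four transitions as in the DataFrame above
labels, source_idx, target_idx = encode_labels(
    ['a', 'a', 'b', 'b'],
    ['c', 'd', 'c', 'd'],
)
print(labels)      # ['a', 'b', 'c', 'd']
print(source_idx)  # [0, 0, 1, 1]
print(target_idx)  # [2, 3, 2, 3]
```

The returned `labels` list is already in index order, so it can be passed as the node labels without typing them by hand.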

Afterwards I can create the dicts that plotly takes. I have to set the labels by hand, in the same order as the numbers assigned above, and the lengths of the source, target, value and color arrays have to match.

import plotly.graph_objects as go

source = df["Source"].values.tolist()
target = df["Target"].values.tolist()
value = df["Value"].values.tolist()

#create links
link = dict(source=source, target=target, value=value, color=["turquoise","tomato"] * 2)

#create nodes
node = dict(label=['a', 'b', 'c', 'd'], pad=15, thickness=5)

#create a sankey object
chart = go.Sankey(link=link, node=node, arrangement="snap")

#build a figure
fig = go.Figure(chart)
fig.show()

I used plotly 5.13.0.

[Figure: Sankey diagram created with plotly]
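
For the data in the question, the same steps might look roughly like the sketch below. It only builds the two dicts, so it runs without plotly installed. `build_sankey_dicts` is a hypothetical helper name, and using the Source and Target values as node labels (rather than the Attribute column, as the question did) is my assumption about what the chart should show.

```python
from itertools import cycle, islice

# rows from the question: (Source, Target, Attribute, Value)
rows = [
    ("TMD", "TMD Create", "Sub-Section 1", 17),
    ("TMD", "TMD Create", "Sub-Section 1", 17),
    ("C4C", "Customer Tab", "Sub-Section 1", 10),
    ("C4C", "Customer Tab", "Sub-Section 1", 10),
    ("C4C", "Customer Tab", "Sub-Section 1", 17),
]


def build_sankey_dicts(rows, palette=("turquoise", "tomato")):
    """Build the link and node dicts with integer source/target indices."""
    sources = [r[0] for r in rows]
    targets = [r[1] for r in rows]
    values = [r[3] for r in rows]

    # number the node labels in order of first appearance
    labels = list(dict.fromkeys(sources + targets))
    index = {label: i for i, label in enumerate(labels)}

    link = dict(
        source=[index[s] for s in sources],
        target=[index[t] for t in targets],
        value=values,
        # repeat the palette so there is exactly one color per link
        color=list(islice(cycle(palette), len(rows))),
    )
    node = dict(label=labels, pad=15, thickness=5)
    return link, node


link, node = build_sankey_dicts(rows)
print(node["label"])   # ['TMD', 'C4C', 'TMD Create', 'Customer Tab']
print(link["source"])  # [0, 0, 1, 1, 1]
print(link["target"])  # [2, 2, 3, 3, 3]
```

The resulting dicts can then be passed to `go.Sankey(link=link, node=node, arrangement="snap")` exactly as in the code above.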

mosc9575
  • That works! Thanks for the help here! I found this site to be a great resource as well: https://coderzcolumn.com/tutorials/data-science/how-to-plot-sankey-diagram-in-python-jupyter-notebook-holoviews-and-plotly – ASH Jan 27 '23 at 03:57
  • To use the latest version of `bokeh` you have to update `panel` to 1.0 or higher and `holoviews` to 1.16 or higher as well. – mosc9575 May 22 '23 at 11:09