How to convert categorical data to a Sankey plot in Python?

Question

I have data like this, for example:

    Roll no.  1          2         3          4
0   30        Good       Fair      Excellent  Good  
1   31        Poor       Fair      Good       NaN
2   34        Excellent  Good      Poor       Fair
3   35        Good       Good      Fair       Good
4   36        NaN        Fair      Poor       Fair
5   37        Excellent  Good      Excellent  Excellent
6   39        Good       Good      Fair       Excellent
7   42        Good       Good      Fair       Fair
8   44        Fair       Good      Fair       Poor
9   45        Good       Good      Good       Good
10  46        Poor       Good      Fair       Fair
11  50        Excellent  Good      Good       Good

The data shows students' performance across Test 1, Test 2, Test 3 and Test 4, and I need to compare how they perform over time. For that, I want to build a Sankey chart showing how many students move from one category to another between tests.

I am familiar with data visualization, but I am really clueless about this chart. I don't know how to define the sources, destinations, nodes, etc.

Please help.

I tried this for Test 1 and Test 2, but it didn't work:

data_cross = pd.crosstab(data_pivot[1], data_pivot[2])
data_tidy = data_cross.rename_axis(None, axis=1).reset_index().copy()

formatted_data = pd.melt(data_tidy,
                         [1],
                         var_name='2',
                         value_name='Value')

formatted_data


import plotly.graph_objects as go

fig = go.Figure(data=[go.Sankey(
    node = dict(
      pad = 15,
      thickness = 20,
      line = dict(color = "black", width = 0.5),
      label = ['Poor','Fair','Good','Excellent'], 
      color = ['blue', 'yellow', 'green', 'orange']
    ),
    link = dict(
        source = [0, 0], # indices correspond to labels...
        target = [1, 2],
        value = formatted_data['Value'],
        color = ['lightblue', 'lightyellow', 'lightgreen','lightorange']
  ))])

fig.update_layout(title_text="Basic Sankey Diagram", font_size=10)
fig.show()

I want something like this:

Instead of Wales, Scotland and England, I want my categories (Poor, Fair, Good, Excellent), and I have several sources and destinations, i.e. Tests 1, 2, 3 and 4.

@JohanC I have the above code, for two tests, but it didn't work in my favor — Pri D, Dec 02 '21 at 06:55
your current code is `plotly` but you've only tagged `matplotlib`. if `plotly` solutions are acceptable, you should edit the tags so `plotly` users will also notice your post — tdy, Dec 02 '21 at 07:24

score 0 · Answer 1 · answered Oct 05 '22 at 18:29

A Sankey chart can be created using the D3Blocks library.

Install first:

pip install d3blocks


# Load d3blocks
from d3blocks import D3Blocks

# Initialize
d3 = D3Blocks()

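# Load example data (a DataFrame with 'source', 'target' and 'weight' columns)
# df = d3.import_example('energy')

# In your case, df must first be an adjacency matrix of transitions
# (for example a pd.crosstab of two test columns). Convert it to
# source/target/weight form with:
df = d3.adjmat2vec(df)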

# Plot
d3.sankey(df, filepath='sankey.html')
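
If a plotly solution is acceptable, the source/target/value lists the question asks about can be built by counting transitions between consecutive tests. The following is only a sketch under some assumptions: the function names `build_sankey_links` and `plot_sankey`, the "Test N" labels, and the choice to skip missing grades (NaN) are all made up for illustration and are not part of plotly or of the original post. Each (test, category) pair gets its own node, so "Good" in Test 1 and "Good" in Test 2 are separate nodes.

```python
from collections import Counter

CATEGORIES = ["Poor", "Fair", "Good", "Excellent"]


def build_sankey_links(rows, test_names, categories=CATEGORIES):
    """Return (labels, source, target, value) for consecutive-test transitions.

    rows: one sequence of grades per student, ordered like test_names.
    Grades that are not in `categories` (e.g. NaN or None) are skipped.
    """
    k = len(categories)
    index = {cat: i for i, cat in enumerate(categories)}
    # One node per (test, category) pair; node index = test_idx * k + cat_idx.
    labels = [f"{test}: {cat}" for test in test_names for cat in categories]

    counts = Counter()
    for grades in rows:
        for t in range(len(test_names) - 1):
            a, b = grades[t], grades[t + 1]
            if a in index and b in index:
                counts[(t * k + index[a], (t + 1) * k + index[b])] += 1

    links = sorted(counts.items())  # sorted only to make the output deterministic
    source = [s for (s, _), _ in links]
    target = [d for (_, d), _ in links]
    value = [n for _, n in links]
    return labels, source, target, value


def plot_sankey(labels, source, target, value, title="Grade transitions"):
    """Build a plotly Sankey figure from the lists above (requires plotly)."""
    import plotly.graph_objects as go  # imported lazily; only needed for plotting

    fig = go.Figure(go.Sankey(
        node=dict(pad=15, thickness=20, label=labels),
        link=dict(source=source, target=target, value=value),
    ))
    fig.update_layout(title_text=title, font_size=10)
    return fig


# First three students from the question, as plain lists.
example_rows = [
    ["Good", "Fair", "Excellent", "Good"],
    ["Poor", "Fair", "Good", float("nan")],
    ["Excellent", "Good", "Poor", "Fair"],
]
tests = ["Test 1", "Test 2", "Test 3", "Test 4"]
labels, source, target, value = build_sankey_links(example_rows, tests)
```

With the DataFrame from the question, something like `rows = data_pivot[[1, 2, 3, 4]].values.tolist()` should give the expected input, assuming the test columns really are named 1 to 4 as in the attempted code. Then `plot_sankey(labels, source, target, value).show()` displays the chart.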
