I have a data: for eg:
Roll no. 1 2 3 4
0 30 Good Fair Excellent Good
1 31 Poor Fair Good NaN
2 34 Excellent Good Poor Fair
3 35 Good Good Fair Good
4 36 NaN Fair Poor Fair
5 37 Excellent Good Excellent Excellent
6 39 Good Good Fair Excellent
7 42 Good Good Fair Fair
8 44 Fair Good Fair Poor
9 45 Good Good Good Good
10 46 Poor Good Fair Fair
11 50 Excellent Good Good Good
Student data with their performance across Test 1, Test 2, Test 3, Test 4 and I need to compare how they are performing over time. So for that I need to prepare the Sankey chart to check how many are getting converted from one Category to another.
I am very much familiar with data visualization and I really clueless about this chart. I don't know how to source and destination, nodes, etc.
Please help.
I have tried for test 1 and test 2:
But I didn't work:
data_cross = pd.crosstab(data_pivot[1], data_pivot[2])
data_tidy = data_cross.rename_axis(None, axis=1).reset_index().copy()
formatted_data = pd.melt(data_tidy,
[1],
var_name='2',
value_name='Value')
formatted_data
import plotly.graph_objects as go
fig = go.Figure(data=[go.Sankey(
node = dict(
pad = 15,
thickness = 20,
line = dict(color = "black", width = 0.5),
label = ['Poor','Fair','Good','Excellent'],
color = ['blue', 'yellow', 'green', 'orange']
),
link = dict(
source = [0, 0], # indices correspond to labels...
target = [1, 2],
value = formatted_data['Value'],
color = ['lightblue', 'lightyellow', 'lightgreen','lightorange']
))])
fig.update_layout(title_text="Basic Sankey Diagram", font_size=10)
fig.show()
I want something like this:
Instead of Wales, Scotland, England I want my categories i.e. poor, fair, good, excellent and also I have so many sources and destinations i.e Test 1, Test 2, 3 and 4.