I want to visualize category membership and am looking to do it with a `go.Sankey` diagram, but the examples I have found are not very clear.
My dataframe looks like this:
label1 | count1 | length1 | label2 | count2 | length2 | label3 | count3 | length3 | label4 | count4 | length4 | etc... |
---|---|---|---|---|---|---|---|---|---|---|---|---|
cat1 | 600 | 5 | cat3 | 1200 | 20 | no_change | ||||||
cat1 | 400 | 20 | cat5 | 1000 | 10 | cat1 | 2000 | 8 | no_change | |||
cat3 | 200 | 12 | no_change | |||||||||
cat8 | 800 | 17 | cat1 | 890 | 32 | cat15 | 100 | 4 | cat3 | 1230 | 20 | no_change |
The table goes up to label8. Each label column has around 15 unique values, and the same values can appear in more than one label column. I understand that this many unique values might hurt the readability of the diagram and that I might have to collapse them into fewer categories.
I have read the plotly example and a few others, but my confusion is mainly about the re-use of labels: do I need to enumerate the values of each label column separately, so that for example cat1 in label1 and cat1 in label3 become different nodes and the diagram won't loop back on itself? What would be the best way of doing that?
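To make the enumeration idea concrete, here is a minimal sketch of what I think is needed: every (stage, label) pair gets its own node index, so a repeated label never points back to an earlier column. The helper names `build_sankey_inputs` and `make_figure` are made up for illustration, the rows are assumed to come from `df.to_dict("records")` with empty cells as `None` or NaN, and using the source column's count as the link value is only my assumption.

```python
from collections import defaultdict


def is_missing(value):
    """True for None, empty strings, and NaN (how pandas marks empty cells)."""
    return value is None or value == "" or (isinstance(value, float) and value != value)


def build_sankey_inputs(rows, n_stages):
    """Turn wide rows (label1, count1, label2, ...) into Sankey node/link lists.

    Each (stage, label) pair is a separate node, so "cat1" in label1 and
    "cat1" in label3 get different indices and links only go left to right.
    """
    node_index = {}              # (stage, label) -> node id
    flows = defaultdict(float)   # (source id, target id) -> summed value

    def node_id(stage, label):
        return node_index.setdefault((stage, label), len(node_index))

    for row in rows:
        for stage in range(1, n_stages):
            src = row.get(f"label{stage}")
            dst = row.get(f"label{stage + 1}")
            value = row.get(f"count{stage}")
            if is_missing(src) or is_missing(dst) or is_missing(value):
                break  # the row ends here (e.g. after "no_change")
            # Assumption: the flow size is the count of the source stage.
            flows[(node_id(stage, src), node_id(stage + 1, dst))] += float(value)

    # Node labels in index order; dicts keep insertion order.
    labels = [label for (_stage, label) in node_index]
    return {
        "labels": labels,
        "source": [s for s, _ in flows],
        "target": [t for _, t in flows],
        "value": list(flows.values()),
    }


def make_figure(inputs):
    """Build the plotly figure (plotly is imported only when this is called)."""
    import plotly.graph_objects as go

    return go.Figure(
        go.Sankey(
            node=dict(label=inputs["labels"]),
            link=dict(
                source=inputs["source"],
                target=inputs["target"],
                value=inputs["value"],
            ),
        )
    )
```

With the real data I would expect to call it as `make_figure(build_sankey_inputs(df.to_dict("records"), n_stages=8)).show()`. The displayed labels can repeat across stages because only the indices need to be unique.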
Additionally, is there a way to position each node according to its mean length or simply display that without the need to hover over it?
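For the second part, this is roughly what I have in mind, as a sketch only: compute the mean length per (stage, label) node, write it into the node label so it is visible without hovering, and derive `x` from the stage and `y` from the normalized mean length. The function names are hypothetical, the columns are assumed to be named `label1`/`length1`, `label2`/`length2`, and so on, and clamping the coordinates to 0.01–0.99 is my own precaution to keep nodes inside the plot area.

```python
from collections import defaultdict


def mean_length_by_node(rows, n_stages):
    """Mean of length<k> for every (stage, label) pair found in the rows."""
    totals = defaultdict(lambda: [0.0, 0])  # (stage, label) -> [sum, n]
    for row in rows:
        for stage in range(1, n_stages + 1):
            label = row.get(f"label{stage}")
            length = row.get(f"length{stage}")
            # Skip empty cells: None or NaN (NaN is the only value != itself).
            if label is None or length is None or label != label or length != length:
                continue
            totals[(stage, label)][0] += float(length)
            totals[(stage, label)][1] += 1
    return {node: total / n for node, (total, n) in totals.items()}


def layout_for_nodes(means, n_stages):
    """Return (labels, xs, ys) with one entry per node, in dict order."""
    lo, hi = min(means.values()), max(means.values())
    span = (hi - lo) or 1.0  # avoid dividing by zero if all means are equal

    def clamp(v):
        return min(0.99, max(0.01, v))

    labels, xs, ys = [], [], []
    for (stage, label), mean in means.items():
        labels.append(f"{label} (mean length {mean:.1f})")
        xs.append(clamp((stage - 1) / max(n_stages - 1, 1)))  # column by stage
        ys.append(clamp((mean - lo) / span))                  # height by mean length
    return labels, xs, ys
```

The three lists would then go into `go.Sankey(arrangement="snap", node=dict(label=labels, x=xs, y=ys), link=...)`, and the node order has to match the indices used for the link `source` and `target`. Is something like this the intended way to use `node.x` and `node.y`?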