
I am running the code below in RStudio to create a Sankey chart with plotly. The code runs without error, but the Sankey chart is not displayed. What is wrong here?

library("plotly")
a = read.csv('cleanSankey.csv', header=TRUE, sep=',')
node_names <- unique(c(as.character(a$source), as.character(a$target)))
nodes <- data.frame(name = node_names)
links <- data.frame(source = match(a$source, node_names) - 1,
                    target = match(a$target, node_names) - 1,
                    value = a$value)

nodes_with_position <- data.frame(
  "id" = names,
  "label" = node_names,
  "x" = c(0, 0.1, 0.2, 0.3,0.4,0.5,0.6,0.7),
  "y" = c(0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7)
)

#Plot
plot_ly(type='sankey',
        orientation = "h",
        
        node = list(
          label = node_names,
          x = nodes_with_position$x,
          y = nodes_with_position$y,
          color = "grey",
          pad = 15,
          thickness = 20,
          line = list(color = "grey", width = 0.5)),
 
         link = list(
           source = links$source,
           target = links$target,
           value = links$value))

The Sankey chart is plotted, but the nodes for the second layer end up in the last layer. How can I fix the node positions?

peace

1 Answer


You'll need to define the node positions yourself and set the `arrangement` argument, which stops plotly from repositioning the nodes automatically.

This requires some finessing, because you have to supply an `x` and a `y` coordinate for every node. You can find more details here: https://plotly.com/python/sankey-diagram/#define-node-position

Code

library(plotly)

a <- read.csv("~/cleanSankey.csv")

node_names <- unique(c(as.character(a$source), as.character(a$target)))

# added id column for clarity, but it's likely not needed
nodes <- data.frame(
  id = 0:(length(node_names)-1),
  name = node_names
)

links <- data.frame(
  source = match(a$source, node_names) - 1,
  target = match(a$target, node_names) - 1,
  value = a$value
)

# set the coordinates of the nodes
nodes$x <- c(0, 1, 0.5)
nodes$y <- c(0, 0.5, 1)

# plot - note the `arrangement="snap"` argument
plot_ly(
  type='sankey',
  orientation = "h",
  arrangement="snap", # can also change this to 'fixed'
  node = list(
    label = nodes$name,
    x = nodes$x,
    y = nodes$y,
    color = "grey",
    pad = 15,
    thickness = 20,
    line = list(color = "grey", width = 0.5)
  ),
  link = list(
    source = links$source,
    target = links$target,
    value = links$value
  )
)

Plot output with arrangement="snap":

[Image: Sankey plot with arrangement set to "snap"]
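With many nodes, typing the coordinates by hand gets tedious. Below is a minimal sketch of one way to derive them from the links, assuming the links contain no cycles. The toy `links` data, `n_nodes`, and the 0.05 to 0.95 margins are illustrative assumptions, not part of the original data. Keeping values strictly between 0 and 1 is a precaution, since coordinates of exactly 0 are reportedly ignored by some plotly versions.

```r
# Toy links with zero-based node ids (hypothetical stand-in for the real data)
links <- data.frame(
  source = c(0, 0, 1, 2),
  target = c(1, 2, 3, 3)
)
n_nodes <- 4

# Layer of each node = length of the longest path leading into it.
# Repeating the relaxation n_nodes times is enough when there are no cycles.
layer <- rep(0, n_nodes)
for (pass in seq_len(n_nodes)) {
  for (i in seq_len(nrow(links))) {
    s <- links$source[i] + 1  # convert to R's one-based indexing
    t <- links$target[i] + 1
    layer[t] <- max(layer[t], layer[s] + 1)
  }
}

# x: spread the layers from left to right, staying inside (0, 1)
x <- 0.05 + 0.9 * layer / max(1, max(layer))

# y: space the nodes of each layer evenly from top to bottom
y <- ave(layer, layer, FUN = function(v) (seq_along(v) - 0.5) / length(v))
```

The resulting `x` and `y` vectors have one entry per node and can be passed as `node = list(x = x, y = y, ...)` in the `plot_ly()` call above.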

camraynor
  • I think you just need to add `arrangement="snap"` to your `plot_ly` function. If that doesn't work, try `arrangement="fixed"` – camraynor Feb 10 '22 at 07:16
  • I have 8 layers/columns in the Sankey chart. Following your suggestion, I got the error "Error in as.data.frame.default(x[[i]], optional = TRUE) : cannot coerce class ‘"function"’ to a data.frame". The code is updated in the post. – peace Feb 10 '22 at 07:17
  • I tried adding the arrangement parameter, but it still failed. The error is still: Error in as.data.frame.default(x[[i]], optional = TRUE) : cannot coerce class ‘"function"’ to a data.frame – peace Feb 10 '22 at 07:28
  • Do you get the error with the example data I provided or a different data set? – camraynor Feb 10 '22 at 07:35
  • I am using my own dataset. I have added some sample CSV data to the post. – peace Feb 10 '22 at 07:37
  • got it, I'll see if I can reproduce with the csv sample – camraynor Feb 10 '22 at 07:39
  • The error is because `"id" = names,` is trying to assign the `names()` function as a column of the data.frame. It seems there are only three nodes but there are 8 coordinates. How many nodes are you expecting there to be? – camraynor Feb 10 '22 at 07:50
  • In total there are 28 nodes in my CSV file; the largest layer/column has 18 nodes. – peace Feb 10 '22 at 07:52
  • I'll update my answer with new code – camraynor Feb 10 '22 at 07:56
  • I've updated my answer using the sample csv data – camraynor Feb 10 '22 at 08:05
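The error discussed in the comments can be avoided by building the id column from the node names instead of from the base function `names`. The sketch below uses hypothetical node names and evenly spaced placeholder coordinates, and adds a check that every node has exactly one coordinate pair.

```r
# Hypothetical stand-in for unique(c(a$source, a$target)) from the question
node_names <- c("A", "B", "C", "D")

# seq_along() yields 1..n; subtracting 1 gives the zero-based ids used by the links
nodes_with_position <- data.frame(
  id    = seq_along(node_names) - 1,
  label = node_names,
  # placeholder coordinates: one value per node, kept inside (0, 1)
  x     = seq(0.1, 0.9, length.out = length(node_names)),
  y     = seq(0.1, 0.9, length.out = length(node_names))
)

# Fails early if the number of coordinates does not match the number of nodes
stopifnot(nrow(nodes_with_position) == length(node_names))
```

Deriving the vector lengths from `length(node_names)` avoids the mismatch between 8 hard-coded coordinates and the actual number of nodes.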