
I am trying to make a Sankey diagram in R, but I am having trouble understanding how the node names are used.

Below is an example that I found online:

library(networkD3)
nodes = data.frame("name" = 
                     c("Node A", # Node 0
                      "Node B", # Node 1
                      "Node C", # Node 2
                      "Node D"))# Node 3
links = as.data.frame(matrix(c(
  0, 1, 10, # Each row represents a link. The first number
  0, 2, 20, # represents the node being connected from.
  1, 3, 30, # The second number represents the node connected to.
  2, 3, 40),# The third number is the value of the link.
  byrow = TRUE, ncol = 3))
names(links) = c("source", "target", "value")
sankeyNetwork(Links = links, Nodes = nodes,
              Source = "source", Target = "target",
              Value = "value", NodeID = "name",
              fontSize = 12, nodeWidth = 30)

[image: Sankey diagram produced by the code above]

The dataset I am using is below:

source<-c('EASTERN PARKWAY', 'CONEY ISLAND AVENUE', 'ATLANTIC AVENUE', 'ATLANTIC AVENUE','ATLANTIC AVENUE','ATLANTIC AVENUE',
      'AVENUE P', 'BAY PARKWAY', 'BUFFALO AVENUE', 'FLATBUSH AVENUE', 'PROSPECT EXPRESSWAY', 'SAINT JOHNS PLACE',
      '6 AVENUE', '65 STREET', '65 STREET', '65 STREET', 'ATLANTIC AVENUE', 'ATLANTIC AVENUE', 'ATLANTIC AVENUE', 'CONEY ISLAND AVENUE')

target<-c('BUFFALO AVENUE', 'AVENUE J', 'CLASSON AVENUE', 'EASTERN PARKWAY', 'HICKS STREET', 'LOGAN STREET',
      'EAST 18 STREET', 'CROPSEY AVENUE', 'EASTERN PARKWAY', 'AVENUE V', 'CHURCH AVENUE', 'ROCHESTER AVENUE',
      'ATLANTIC AVENUE', '17 AVENUE', '18 AVENUE', 'BAY PARKWAY', 'NEVINS STREET', 'UTICA AVENUE', 'VANDERBILT AVENUE', 'AVENUE P')

value<-c(8,5,4,4,4,4,4,4,4,4,4,4,3,3,3,3,3,3,3,3)
df<-data.frame(source, target, value)
df

   source              target             value
 1 EASTERN PARKWAY     BUFFALO AVENUE      8.00
 2 CONEY ISLAND AVENUE AVENUE J            5.00
 3 ATLANTIC AVENUE     CLASSON AVENUE      4.00
 4 ATLANTIC AVENUE     EASTERN PARKWAY     4.00
 5 ATLANTIC AVENUE     HICKS STREET        4.00
 6 ATLANTIC AVENUE     LOGAN STREET        4.00
 7 AVENUE P            EAST 18 STREET      4.00
 8 BAY PARKWAY         CROPSEY AVENUE      4.00
 9 BUFFALO AVENUE      EASTERN PARKWAY     4.00
10 FLATBUSH AVENUE     AVENUE V            4.00
11 PROSPECT EXPRESSWAY CHURCH AVENUE       4.00
12 SAINT JOHNS PLACE   ROCHESTER AVENUE    4.00
13 6 AVENUE            ATLANTIC AVENUE     3.00
14 65 STREET           17 AVENUE           3.00
15 65 STREET           18 AVENUE           3.00
16 65 STREET           BAY PARKWAY         3.00
17 ATLANTIC AVENUE     NEVINS STREET       3.00
18 ATLANTIC AVENUE     UTICA AVENUE        3.00
19 ATLANTIC AVENUE     VANDERBILT AVENUE   3.00
20 CONEY ISLAND AVENUE AVENUE P            3.00

Does anyone know how to reproduce the Sankey diagram above with this data? I can't figure out how the nodes come into play. Any help would be appreciated, thanks!

CJ Yetman
nak5120

1 Answer


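The Nodes data frame defines all of the nodes that will be plotted, one row per node. The NodeID argument names the column in the Nodes data frame that contains the label displayed for each node. The Source and Target columns of the Links data frame refer to nodes by their 0-based row position in the Nodes data frame, not by name.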

links <- read.csv(text = "
source,target,value
EASTERN PARKWAY,BUFFALO AVENUE,8.00
CONEY ISLAND AVENUE,AVENUE J,5.00
ATLANTIC AVENUE,CLASSON AVENUE,4.00
ATLANTIC AVENUE,EASTERN PARKWAY,4.00
ATLANTIC AVENUE,HICKS STREET,4.00
ATLANTIC AVENUE,LOGAN STREET,4.00
AVENUE P,EAST 18 STREET,4.00
BAY PARKWAY,CROPSEY AVENUE,4.00
BUFFALO AVENUE,EASTERN PARKWAY,4.00
FLATBUSH AVENUE,AVENUE V,4.00
PROSPECT EXPRESSWAY,CHURCH AVENUE,4.00
SAINT JOHNS PLACE,ROCHESTER AVENUE,4.00
6 AVENUE,ATLANTIC AVENUE,3.00
65 STREET,17 AVENUE,3.00
65 STREET,18 AVENUE,3.00
65 STREET,BAY PARKWAY,3.00
ATLANTIC AVENUE,NEVINS STREET,3.00
ATLANTIC AVENUE,UTICA AVENUE,3.00
ATLANTIC AVENUE,VANDERBILT AVENUE,3.00
CONEY ISLAND AVENUE,AVENUE P,3.00
")

# build a nodes data frame using all unique names of nodes found in your links
# source *and* target vectors
nodes <- data.frame(name = unique(c(as.character(links$source), as.character(links$target))))

# set the source and target values in your links data frame to the index of the
# node that they refer to in the nodes data frame (0-indexed because it's
# used by JavaScript)
links$source <- match(links$source, nodes$name) - 1
links$target <- match(links$target, nodes$name) - 1

# plot it
library(networkD3)
sankeyNetwork(Links = links, Nodes = nodes, Source = "source", 
              Target = "target", Value = "value", NodeID = "name")

[image: Sankey diagram produced by the code above]
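The name-to-index step can be checked on a toy example before plotting. This is a minimal sketch in base R; the `toy_links` and `toy_nodes` data frames and the `source_id`/`target_id` columns are hypothetical names used only for illustration, not part of networkD3.

```r
# Toy links table (made-up data, only for illustrating the mapping)
toy_links <- data.frame(
  source = c("A", "A", "B"),
  target = c("B", "C", "C"),
  value  = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# One row per distinct node name, in order of first appearance
toy_nodes <- data.frame(
  name = unique(c(toy_links$source, toy_links$target)),
  stringsAsFactors = FALSE
)

# match() returns 1-based row positions; subtract 1 to get 0-based indices
toy_links$source_id <- match(toy_links$source, toy_nodes$name) - 1
toy_links$target_id <- match(toy_links$target, toy_nodes$name) - 1

# Sanity checks: node names are unique and every link end was matched
stopifnot(anyDuplicated(toy_nodes$name) == 0)
stopifnot(!anyNA(toy_links$source_id), !anyNA(toy_links$target_id))

print(toy_links)
```

If the first check fails, the nodes data frame contains repeated names. If the second fails, a link refers to a name that is missing from the nodes data frame.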

CJ Yetman
  • This is amazing, thank you. Mine looks a little different from yours. How did you get the dotted line going back to Eastern Parkway? – nak5120 Jul 26 '18 at 13:01
  • Mine is a little more stretched out and is tough to see - https://drive.google.com/open?id=1wbI3SEhZ3OljVO3Gr5UTk6I6z-M817X3 – nak5120 Jul 26 '18 at 13:03
  • Looks like you have multiple nodes with the same name, so you must not have created the `nodes` data frame properly with **unique** nodes. – CJ Yetman Jul 26 '18 at 13:05
  • Ok, I copied and pasted the code you provided and that's what came up. Is your plot from different code? – nak5120 Jul 26 '18 at 13:07
  • nope, that's the exact code that made the plot I posted – CJ Yetman Jul 26 '18 at 13:10
  • ok yeah that's strange, I'm getting a different plot. Is there a zoom option to make it larger in R? – nak5120 Jul 26 '18 at 13:11
  • If you're looking at it in the "Viewer" panel in RStudio, resize the Viewer window and then click refresh. If you're looking at it in a browser, resize the browser window and then refresh. – CJ Yetman Jul 26 '18 at 13:12
  • If you're not getting the same thing from copy-pasting the code, you should probably quit RStudio and re-open it to make sure something is not messed up with your environment. – CJ Yetman Jul 26 '18 at 13:16
  • last question, sorry. Do you know how to make the font size larger in sankey plots? – nak5120 Jul 26 '18 at 13:19
  • also didn't change when I reloaded R. It still works though when I put it in a new window! – nak5120 Jul 26 '18 at 13:19
  • try reading the help file... there are very clear directions on how to adjust the font size – CJ Yetman Jul 26 '18 at 13:22