
I am using the networkD3 library for R to create Sankey networks. While this works well for me in general, I have now run into a problem when using the arguments "NodeID" and/or "NodeGroup" to group nodes and assign colors as shown at https://christophergandrud.github.io/networkD3/#sankey

The following code shows four attempts at creating a Sankey diagram. Only "sankey4" works as designed, i.e., without colors:

library(networkD3)
#Unique list of nodes
my_nodes = structure(list(name = c("HawaiTEST", "AMSVOASMPP01", "App1", 
                                   "Transfer", "Transferred_tel__63null_",
                                   "Transferred_tel__631100107_", 
                                   "AMSVOASMPP02",
                                   "Transferred_tel__631100108_",
                                   "Transferred_tel__631100106_", 
                                   "Transferred_tel__631100104_",
                                   "Transferred_tel__631100105_", 
                                   "FarEndDisconnect",
                                   "FarEndDisconnect_Hangup", "DutchAOS",
                                   "SwedenAOS", 
                                   "Transferred_tel__63000_")), class =
c("tbl_df", "tbl", "data.frame"
  ), row.names = c(NA, -16L), .Names = "name")

# Network
my_links = structure(list(key = c("0_1", "0_6", "1_13", "1_14", "1_2", "11_12", 
                                  "13_11", "13_3", "14_11", "14_3", "2_11",
                                  "2_3", "3_10", "3_15", 
                                  "3_4", "3_5", "3_7", "3_8", "3_9", "6_13",
                                  "6_2"), source = c(0L, 
                                  0L, 1L, 1L, 1L, 11L, 13L, 13L, 14L, 14L, 2L,
                                  2L, 3L, 3L, 3L, 
                                  3L, 3L, 3L, 3L, 6L, 6L), target = c(1L, 6L,
                                  13L, 14L, 2L, 12L, 
                                  11L, 3L, 11L, 3L, 11L, 3L, 10L, 15L, 4L, 5L,
                                  7L, 8L, 9L, 13L, 
                                  2L), total = c(38L, 36L, 4L, 3L, 31L, 6L, 2L,
                                  5L, 1L, 2L, 3L, 
                                  61L, 11L, 1L, 12L, 11L, 11L, 11L, 11L, 3L,
                                  33L)), class = c("tbl_df", 
                          "tbl", "data.frame"), row.names = c(NA, -21L), .Names
= c("key", 
    "source", "target", "total"))


# NOT WORKING when using "NodeID" or "NodeGroup"
sankey1 = sankeyNetwork(Links = my_links, Nodes = my_nodes, Source = "source",
                        Target = "target", Value = "total", units = "calls",
                        NodeID = "name")
sankey1

sankey2 = sankeyNetwork(Links = my_links, Nodes = my_nodes, Source = "source",
                        Target = "target", Value = "total", units = "calls",
                        NodeGroup = "name")
sankey2

# NOT WORKING using ColourScale (diagram is displayed, grey scale though)
ColourScale <- 'd3.scale.ordinal()
            .domain(["lions", "tigers"])
           .range(["#FF6900", "#694489"]);'
sankey3 = sankeyNetwork(Links =my_links, Nodes = my_nodes, Source =
                           "source", Target = "target", Value = "total", units =
                           "calls", colourScale = JS(ColourScale))
sankey3

# WORKING! 

sankey4 = sankeyNetwork(Links =my_links, Nodes = my_nodes, Source =
                           "source", Target = "target", Value = "total", units =
                           "calls")
sankey4

"Sankey1" tries using "NoteID" the way it is used at the example from the web referenced above, however, doing that results in the diagram not being displayed at all; the same effect for "Sankey2". "Sankey4" is displayed in grey regardless of the color scheme definition.

I have also compared the HTML produced by my R code for "Sankey1" with the HTML used on https://christophergandrud.github.io/networkD3/#sankey. There is a clear difference in how the group is written:

HTML from "Sankey1":

"group":{"name":["HawaiTEST", ...
...
"options":{"NodeID":1,"NodeGroup":"name","LinkGroup":null,

HTML excerpt from the web example:

"group":["Agricultural 'waste'","Bio-conversion", ...
...
"options":{"NodeID":"name","NodeGroup":"name","LinkGroup":null

Editing the output HTML for "Sankey1" so that it matches the output from the web example solves the issue: "Sankey1" is then displayed using the default color scheme.

I am hitting a wall at the moment trying to understand this behavior for the data I am using. The sankey function does not require a list as input; I have also split the example data set from the website into two data frames (nodes, links), and this produces the same Sankey diagram with colors as in the web example. Hence, something must be wrong with the input data in my example, I guess. Any help would be highly appreciated! Thanks, Oli

Oliver

2 Answers


Perhaps I am misunderstanding, but with either the CRAN or the GitHub version, sankey1 produces the following for me, with nodes colored by their name as expected.

[Image: sankey diagram 1 with colors]

If we want to use NodeGroup, we could do something like this.

# make up a group based on the first two characters
#  of node name
my_nodes$group <- substr(my_nodes$name,1,2)
# now use our new group for group colors
sankeyNetwork(
  Links =my_links, Nodes = my_nodes, Source = "source",
  Target = "target", Value = "total", NodeID = "name",
  units = "calls",
  NodeGroup = "group"
)

[Image: sankey with grouped colors]

If we wanted to supply a custom color scale, we could do this.

sankeyNetwork(
  Links =my_links, Nodes = my_nodes, Source = "source",
  Target = "target", Value = "total", NodeID = "name",
  units = "calls",
  NodeGroup = "group",
  colourScale = "d3.scale.category10()"
)

[Image: sankey with custom color scale]

To assign specific colors to specific groups, we could extend the previous example and build on the d3.scale.category* functions, overriding their range and domain.

sankeyNetwork(
  Links =my_links, Nodes = my_nodes, Source = "source",
  Target = "target", Value = "total", NodeID = "name",
  units = "calls",
  NodeGroup = "group",
  colourScale = sprintf(
    "d3.scale.category10().range(%s).domain(%s)",
    jsonlite::toJSON(substr(topo.colors(length(unique(my_nodes$group))),1,7)),
    jsonlite::toJSON(unique(my_nodes$group))
  )
)

[Image: sankey with custom group colors]

timelyportfolio
  • Thank you! The issue on my end obviously was due to a caching setting of my internet connection. After having produced the sankey diagram I called "Sankey4" in the posting, the browser did not grab any of the modified versions ("Sankey1/2/3"), bugger. Yes, everything is working now. Thanks to your success report I have tried it on a different computer - something which I should have done before in the first place. Once again, many thanks indeed for looking at this problem. – Oliver Aug 04 '16 at 12:51
  • I have found something interesting. Obviously, my issue was not only caused by a caching setting after all. Everything works as described in your answer. However, as soon as I load the libraries "dplyr" and/or "tidyr" before calling the networkD3 function to create the network, the diagram "sankey1" produces the wrong HTML code as indicated above (`"group":{"name":["HawaiTEST", ...`), which in the end results in an empty diagram. – Oliver Aug 08 '16 at 14:09
  • are you using the same code, or have you changed to use `dplyr` and `tidyr`? – timelyportfolio Aug 09 '16 at 19:27
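The last comments point at a plausible cause that is not confirmed anywhere in this thread, so treat it as an assumption: the question's my_nodes and my_links carry the classes "tbl_df" and "tbl". Once dplyr or tidyr is loaded, selecting a single column from such a table with [ , "name"] is expected to return a one-column table instead of a character vector, which would match the nested "group":{"name":[...]} seen in the HTML. A minimal sketch of a workaround is to strip the extra classes before plotting. The three-row node table below is a shortened stand-in for the real data.

```r
# Shortened stand-in for the question's node table, with the same class attributes
my_nodes <- structure(
  list(name = c("HawaiTEST", "AMSVOASMPP01", "App1")),
  class = c("tbl_df", "tbl", "data.frame"),
  row.names = c(NA, -3L)
)

# Convert to a plain data.frame so that single-column indexing
# drops to a vector again
plain_nodes <- as.data.frame(my_nodes)

class(plain_nodes)                    # only "data.frame" remains
is.character(plain_nodes[, "name"])   # a plain character vector
```

If this assumption holds, passing Nodes = as.data.frame(my_nodes) and Links = as.data.frame(my_links) to sankeyNetwork() should behave the same whether or not dplyr and tidyr are loaded.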

I had a similar issue. I resolved it by decreasing the overall number of nodes (by filtering only edges above a certain value).
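A minimal base R sketch of that kind of filtering, under stated assumptions: the small links and nodes tables and the min_total threshold are made up for illustration. Because sankeyNetwork() expects zero-based source and target indices into the node table, the sketch also drops unused nodes and renumbers the remaining links.

```r
# Illustrative data (hypothetical): zero-based source/target indices
links <- data.frame(
  source = c(0L, 0L, 1L, 2L),
  target = c(1L, 2L, 3L, 3L),
  total  = c(38L, 2L, 31L, 1L)
)
nodes <- data.frame(name = c("A", "B", "C", "D"), stringsAsFactors = FALSE)

min_total <- 5L  # hypothetical cut-off: keep only edges at or above this value

# Keep the heavier edges
kept <- links[links$total >= min_total, ]

# Zero-based indices of the nodes that are still referenced
used <- sort(unique(c(kept$source, kept$target)))

# Drop unused nodes and renumber the links to match the smaller node table
nodes_kept  <- nodes[used + 1L, , drop = FALSE]
kept$source <- match(kept$source, used) - 1L
kept$target <- match(kept$target, used) - 1L
```

The resulting kept and nodes_kept tables can then be passed as Links and Nodes in the same way as in the question.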

lara