0

I want to improve the appearance of my highcharter sankey diagram.

The labels of the nodes should be left and right to the plot and not "on" the nodes. So in the plot shown below the country names should appear left and right to the plot.

Additionally I am failing to add a thousand separator "," to the labels showing up when hovering over links and nodes.

This is the code I use for the sankey plot:

highchart() %>%
    hc_add_series(
    type = "sankey",
    data = df_list_spaces,
    nodes = nodes_list,
    links = df_list_spaces,
    name = "assets",
    dataLabels = list(
      enabled = TRUE
      ))

And this is how it looks:

enter image description here

To add a thousand separator I tried this:

# change thousands sep
hcoptslang <- getOption("highcharter.lang")
hcoptslang$thousandsSep <- ","
options(highcharter.lang = hcoptslang)

This works for other highcharter objects but strangely not for the "sankey" chart.

To reproduce the plot I also add the code with all the data transformation needed. There are probably nicer ways to end up with the final data for the plot but this is how I did it:

library(tidyverse)
library(highcharter)

df <- tibble(from = c("Brazil", "Brazil", "Poland", "Spain", "Poland"),
             toto = c("Portugal", "Spain", "Portugal", "Brazil", "Poland"),
             assets = c(5324523.4523, 232452345.234, 2345234.52345, 3234523.45234, 63452345.2345))


# bring rows into lists
# add space at the end for values of "to" 

transform_to_list <- function(data, from, to, weight, add_spaces = FALSE) {
  
  df_list <- list()
  for (i in 1:nrow(data)) {
    
    df_list[[i]] <- list(from = data[[from]][i],
                  # add spaces, so there is a difference between "from" and "to" countries
                     to = ifelse(test = add_spaces == TRUE, 
                                 yes = paste0(data[[to]][i], " "), 
                                 no = data[[to]][i]  
                                 ),
                     weight = data[[weight]][i]
                  )
  }
  
  return(df_list)
}

# select the data set, from, to and weight column; add_spaces = TRUE is needed for Sankey diagramms,
# for other things, for example "dependency_wheels" add_spaces should be FALSE

df_list_spaces <- transform_to_list(data = df, from = "from", "toto", weight = "assets", add_spaces = TRUE)

#--------------------------------------

# next step is to attach colors to the countries, but the problem is that we had to add spaces to the "to"
# countries and hence we have to find a way to attach the same color for example for "Brazil" and "Brazil ".


# We do this with a little function:

# inputs: data, from column, to column and a colorpallete
transform_to_nodes_list <- function(data, from, to, colors) {
  
  # create a vector of unique values from the "from" column
  from_col <- unique(data[[from]])
  
  # create a vector of values from the "to" column, with a space added at the end
  to_col <- unique(paste0(data[[to]], " "))
  
  # combine the "from" and "to" columns into a single vector
  id_col <- c(from_col, to_col)
  
  # create a "helper" column that ignores spaces at the end of values in the "id" column
  helper_col <- gsub("\\s+$", "", id_col) 
  
  # match colors based on helper column, first create factors, 
  # then change class to numeric so first factor gets first color of the color palette, second factor second color..   
  color_col <- colors[as.numeric(factor(helper_col))]
  
  # not necessary step
  d0 <- tibble(id_col, color_col)
  
  nodes_list <- list()
  for (i in 1:nrow(d0)) {
    
    # from <- deparse(substitute(from))
    # to <- deparse(substitute(to))
    # weight <- deparse(substitute(weight))
    
    nodes_list[[i]] <- list(id = d0[["id_col"]][i],
                            color = d0[["color_col"]][i])
  }
  
  
  return(nodes_list)
}

#  color palette:
pal <- c("brown", "blue", "yellow", "green", "orange", "red")


# call the functions:
df_list_spaces <- transform_to_list(data = df, from = "from", "toto", weight = "assets", add_spaces = TRUE)

nodes_list <- transform_to_nodes_list(data = df, from = "from", to = "toto", colors = pal)

# and finally plot
highchart() %>%
    hc_add_series(
    type = "sankey",
    data = df_list_spaces,
    nodes = nodes_list,
    links = df_list_spaces,
    name = "assets",
    dataLabels = list(
      enabled = TRUE
      ))
Marie Veit
  • 27
  • 5

1 Answers1

0

What about:

library(purrr) ## convenient conversion of dataframe rows to lists

highchart() %>%
  hc_add_series(
    type = "sankey",
    data = df |> pmap(~ list(from = ..1, to = paste0(..2, ' '), weight = ..3))
  ) |>
  hc_tooltip(formatter = JS("function () { return '<b>' +
                             this.point.from + ' -> ' +
                             this.point.to + '</b><br /> ' +
                             Highcharts.numberFormat(this.point.weight)
                            }"
                            )
             )

Formatting of tooltips (thousands separator) adopted from here: https://stackoverflow.com/a/42885998/20513099

Haven't found out yet how to align the node labels though.

I_O
  • 4,983
  • 2
  • 2
  • 15
  • Thanks a lot! Unfortunately I am not able to run your code, but the reason for this is maybe an oversight on my side.. I end up with a R console hang.. Seems like R is stuck in the JS function, but I am not familiar with this so I dont find a way out.. – Marie Veit Jul 10 '23 at 00:18
  • You're welcome. Remember to use `df |> pmap(~ list(from = ..1, to = paste0(..2, ' '), weight = ..3)` as highchart data (without `node_list` etc.), because the `pmap` expression creates row-wise lists of (from, to, weight) as required. – I_O Jul 10 '23 at 05:11