I am trying to visualize my data via a sankey diagram.
I have the following dataframe:
sankey1 <- structure(list(pat_id = c(10037, 10264, 10302, 10302, 10302,
10344, 10482, 10482, 10482, 10613, 10613, 10613, 10628, 10851,
11052, 11203, 11214, 11214, 11566, 11684, 11821, 11945, 11945,
11952, 11952, 12122, 12183, 12774, 13391, 13573, 13643, 14298,
14556, 14556, 14648, 14862, 14935, 14935, 14999, 15514, 15811,
16045, 16045, 16190, 16190, 16190, 16220, 16220, 16220, 16220
), contactnummer = c(1, 1, 1, 2, 3, 1, 1, 2, 3, 1, 2, 3, 1, 1,
1, 1, 1, 2, 1, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1,
1, 1, 2, 1, 1, 1, 1, 2, 1, 2, 3, 1, 2, 3, 99), Combo2 = c(1,
1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1,
2, 4, 4, 1, 5, 1, 1, 1, 1, 3, 3, 1, 5, 1, 1, 3, 1, 1, 1, 1, 1,
3, 6, 3, 1, 1, 1, 1), treatment = c(99, 0, 0, 1, 1, 0, 99, 99,
99, 99, 99, 1, 1, 0, 1, 99, 99, 99, 0, 99, 99, 0, 0, 0, 1, 99,
99, 0, 0, 0, 0, 0, 1, 1, 1, 99, 99, 1, 0, 0, 1, 0, 0, 0, 1, 1,
99, 99, 99, 99)), row.names = c(NA, 50L), class = c("data.table",
"data.frame"))
An ID number ("pat_id") can have multiple rows; each row is a contact ("contactnummer"). My aim is to visualize which combinations ("Combo2") lead to which treatments ("treatment") and at which contact.
I hope to visualise this via a sankey diagram (https://r-graph-gallery.com/321-introduction-to-interactive-sankey-diagram-2.html).
Ideally, the output would look similar to this:
I would like to have the combinations ("Combo2") as arrows, shown in a different colour per unique combination. These arrows should then lead to a treatment. After that I would like them to continue: if an ID number has a second contact after contact 1, the arrow should show which combination occurs after that treatment and which treatment it leads to in the second contact.
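To make the target structure concrete, here is a minimal sketch of the link table I think such a diagram needs. It uses only base R and a small subset of the rows above (pat_id 10264, 10302 and 10482). The node labels ("contact 1: combo 1" and so on) and the names toy, within_links, between_links and links are made up for illustration, and it assumes contacts are numbered consecutively per patient.

```r
# Toy subset of sankey1, copied from the data above
toy <- data.frame(
  pat_id        = c(10264, 10302, 10302, 10302, 10482, 10482, 10482),
  contactnummer = c(1, 1, 2, 3, 1, 2, 3),
  Combo2        = c(1, 1, 1, 2, 2, 1, 1),
  treatment     = c(0, 0, 1, 1, 99, 99, 99)
)
toy <- toy[order(toy$pat_id, toy$contactnummer), ]

# Hypothetical node labels: one node per (contact, combination)
# and one per (contact, treatment), so the two never collide
toy$combo_node <- paste0("contact ", toy$contactnummer, ": combo ", toy$Combo2)
toy$treat_node <- paste0("contact ", toy$contactnummer, ": treatment ", toy$treatment)

# Links within a contact: combination -> treatment
within_links <- data.frame(source = toy$combo_node, target = toy$treat_node)

# Links between contacts: treatment at one contact -> combination
# at the same patient's next contact (NA for the last contact)
next_combo <- ave(toy$combo_node, toy$pat_id,
                  FUN = function(x) c(x[-1], NA))
between_links <- data.frame(source = toy$treat_node, target = next_combo)
between_links <- between_links[!is.na(between_links$target), ]

# Count identical links so that each flow gets a width
all_links <- rbind(within_links, between_links)
all_links$value <- 1
links <- aggregate(value ~ source + target, data = all_links, FUN = sum)

print(links)
```

For networkD3::sankeyNetwork, the source and target columns would still have to be converted to zero-based indices into a nodes data frame (for example with match(), as in my attempt below), and a column passed via the LinkGroup argument could presumably carry the combination for colouring the arrows.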
I've tried the following script, but without success.
library(dplyr) # needed for %>%, group_by(), summarise(), mutate(), filter(), lead()
library(networkD3)
library(d3Network)
# Create a data frame for the Sankey diagram
sankey_data <- sankey1 %>%
group_by(pat_id, Combo2, treatment, contactnummer) %>%
summarise(Count = n()) %>%
mutate(Target = lead(treatment), Value = Count) %>%
filter(!is.na(Target))
# Create a list of unique nodes with color attributes
combo2_nodes <- unique(sankey_data$Combo2)
treatment_nodes <- unique(sankey_data$treatment)
nodes <- data.frame(
name = c(combo2_nodes, treatment_nodes),
color = c(rep("Combo2", length(combo2_nodes)), rep("Treatment", length(treatment_nodes)))
)
# Create a list of links
links <- data.frame(
source = match(sankey_data$Combo2, nodes$name) - 1,
target = match(sankey_data$Target, nodes$name) - 1,
value = sankey_data$Value
)
# Create the Sankey diagram with color attributes
sankey_plot <- sankeyNetwork(
Links = links,
Nodes = nodes,
Source = "source",
Target = "target",
Value = "value",
NodeID = "name",
units = "Count",
NodeGroup = "color" # Specify the color attribute
)
# Display the plot
sankey_plot
But this does not create the diagram I would like. I am very inexperienced with Sankey diagrams. Any tips?