I am trying to visualize sequences of events by using Sankey diagrams. I have a set of event (Event1 to Event16) over sequences of different length. The steps of the sequences are noted by T0, T0 - 1, T0 - 2 ... The width of the flow is corresponding to the frequency rate of the sequences.
I would like that all the nodes corresponding to a given step to be aligned vertically.
By using the GoogleVis package I succeed to obtain the following :
As you can see some events T0-1, T0-2 and T0-3... are on the far right, instead of with the others of their time step. It seems to be due to the fact that it is not possible to have nodes whithout children...
Do you know a way to have hierarchize nodes or/and nodes whithout children, for GoogleVis ? If not, do you know another R package which could allow to have these characteristics for interactive plots ?
My R code is bellow. The main variable containing the sequences is a list of list, see picture.
My code :
# Package
library(googleVis)
library(dplyr)
library(reshape2)
library(tidyverse)
# Load
load("SeqCh")
# Loop -------------------------------------------------------------
# Inits
From = c()
To = c()
Freq = c()
Target = SeqCh
# Get maximum length of sequence
maxls = 0
for (kk in 1:length(Target)){
temp = length(Target[[kk]])
if (temp > maxls){
maxls = temp
}
}
# Loop on length of sequences
for (zz in 2:maxls){
# Prefix to add to manage same event repeated
if (zz == 2){
SufixFrom = "(T0)"
SufixTo = "(T0 - 1)"
} else {
SufixFrom = paste("(T0 - ", as.character(zz-2), ")", sep = "")
SufixTo = paste("(T0 - ", as.character(zz-1), ")", sep = "")
}
# Message
cat("\n")
print(paste(" Processing events from ", SufixFrom, " to ", SufixTo))
# Loop on Target
ind = lapply(Target, function(x) length(x) == zz)
TargetSub = Target[unlist(ind)]
FreqSub = Support[unlist(ind)]
for (jj in 1:length(TargetSub)){
temp = TargetSub[[jj]]
TempFrom = paste(temp[zz-1], SufixFrom, sep = " ")
TempTo = paste(temp[zz], SufixTo, sep = " ")
From = c(From, TempFrom)
To = c(To, TempTo)
Freq = c(Freq, FreqSub[jj])
}
} # end for loop on length of sequences
# All in same variable
Flows = data.frame("From" = From, "To" = To, "Occurence_Frequency" = Freq, stringsAsFactors = FALSE)
# Plot --------------------------------------------------------------------
plot(gvisSankey(Flows, from='From', to='To', weight="Occurence_Frequency",
options=list(height=900, width=1800, sankey="{link:{color:{fill:'lightblue'}}}")))
Thanks, Romain.