
I am trying to visualize sequences of events using Sankey diagrams. I have a set of events (Event1 to Event16) occurring in sequences of different lengths. The steps of the sequences are labelled T0, T0 - 1, T0 - 2, ... The width of each flow corresponds to the frequency of the sequence.

I would like all the nodes corresponding to a given step to be aligned vertically.

Using the googleVis package, I managed to obtain the following:

Sankey with GoogleVis

As you can see, some events at T0 - 1, T0 - 2, T0 - 3, ... are on the far right instead of with the other events of their time step. This seems to happen because it is not possible to have nodes without children.

Do you know a way to order the nodes hierarchically and/or to have nodes without children in googleVis? If not, do you know another R package that offers these features for interactive plots?
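For reference, here is a minimal sketch of the kind of alternative I have in mind, assuming the networkD3 package is an option. Its sankeyNetwork() function has a sinksRight argument, and setting it to FALSE should stop nodes without children from being pushed to the right edge. I have not confirmed that this gives the exact alignment I need. The Flows values below are made up for illustration.

```r
# Toy flows in the same From / To / weight layout as my real table
# (hypothetical values, not my real data).
Flows <- data.frame(
  From = c("Event1 (T0)", "Event1 (T0)", "Event2 (T0 - 1)"),
  To   = c("Event2 (T0 - 1)", "Event3 (T0 - 1)", "Event4 (T0 - 2)"),
  Occurence_Frequency = c(0.40, 0.25, 0.10),
  stringsAsFactors = FALSE
)

# networkD3 expects a node table plus links that use zero-based node indices.
Nodes <- data.frame(name = unique(c(Flows$From, Flows$To)),
                    stringsAsFactors = FALSE)
Links <- data.frame(
  source = match(Flows$From, Nodes$name) - 1L,
  target = match(Flows$To, Nodes$name) - 1L,
  value  = Flows$Occurence_Frequency
)

# Build the widget only if networkD3 is installed. sinksRight = FALSE asks
# the layout not to move nodes without children to the right edge.
if (requireNamespace("networkD3", quietly = TRUE)) {
  p <- networkD3::sankeyNetwork(
    Links = Links, Nodes = Nodes,
    Source = "source", Target = "target", Value = "value",
    NodeID = "name", sinksRight = FALSE
  )
  # Evaluate p at the console (or print(p)) to display the diagram.
}
```

In this toy example "Event3 (T0 - 1)" has no children, which is the case that ends up on the far right with googleVis.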

My R code is below. The main variable containing the sequences is a list of lists, see the picture.

Data containing sequences
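Since the picture may not be readable, here is a small sketch of the data layout. The values are hypothetical and not my real sequences. It assumes each element of SeqCh is one sequence of events, ordered from T0 backwards, and that Support is a numeric vector with one frequency per sequence.

```r
# Hypothetical toy data with the same shape as my real variables.
SeqCh <- list(
  list("Event1", "Event2"),            # Event1 at T0, Event2 at T0 - 1
  list("Event1", "Event3"),
  list("Event1", "Event2", "Event4")   # three steps: T0, T0 - 1, T0 - 2
)
Support <- c(0.40, 0.25, 0.10)         # one frequency per sequence

# The sequence lengths decide how many time steps the diagram needs.
seq_lengths <- lengths(SeqCh)
maxls <- max(seq_lengths)

# Quick look at the structure.
str(SeqCh, max.level = 1)
```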

My code:

# Package

library(googleVis)
library(dplyr)
library(reshape2) 
library(tidyverse)

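# Load (the file is expected to provide both SeqCh and Support, 
# since Support is used below but not created in this script)

load("SeqCh")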

# Loop -------------------------------------------------------------

# Inits 

From = c()
To = c() 
Freq = c()
Target = SeqCh


# Get maximum length of sequence 

maxls = max(lengths(Target))

# Loop on length of sequences 

for (zz in 2:maxls){

  # Suffix added to tell apart the same event repeated at different steps 

  if (zz == 2){

    SufixFrom = "(T0)"
    SufixTo = "(T0 - 1)"

  } else {

    SufixFrom = paste("(T0 - ", as.character(zz-2), ")", sep = "") 
    SufixTo = paste("(T0 - ", as.character(zz-1), ")", sep = "") 

  }

  # Message 

  cat("\n")
  print(paste(" Processing events from ", SufixFrom, " to ", SufixTo))

  # Keep the sequences of length zz and their frequencies 

  ind = lengths(Target) == zz
  TargetSub = Target[ind]
  FreqSub = Support[ind]

  # seq_along() skips the loop when no sequence has length zz 

  for (jj in seq_along(TargetSub)){

    temp = TargetSub[[jj]]
    TempFrom = paste(temp[zz-1], SufixFrom, sep = " ")
    TempTo = paste(temp[zz], SufixTo, sep = " ")
    From = c(From, TempFrom)
    To = c(To, TempTo)
    Freq = c(Freq, FreqSub[jj])

  }

} # end for loop on length of sequences

# All in same variable

Flows = data.frame("From" = From, "To" = To, "Occurence_Frequency" = Freq, stringsAsFactors = FALSE)

# Plot --------------------------------------------------------------------

plot(gvisSankey(Flows, from='From', to='To', weight="Occurence_Frequency",
                options=list(height=900, width=1800, sankey="{link:{color:{fill:'lightblue'}}}")))

Thanks, Romain.

rom9569
  • Did you manage to solve this issue? I recently faced the same problem and have no idea how to solve it. – iomedee Mar 11 '20 at 10:59

0 Answers