Using the dev version of the ggforce package, I can create a Sankey diagram as follows (from the documentation):

library(ggplot2)
library(ggforce)

data <- reshape2::melt(Titanic)
data <- gather_set_data(data, 1:4)

ggplot(data, aes(x, id = id, split = y, value = value)) +
  geom_parallel_sets(aes(fill = Sex), alpha = 0.3, axis.width = 0.1) +
  geom_parallel_sets_axes(axis.width = 0.1) +
  geom_parallel_sets_labels(colour = 'white')

What I'm struggling with is getting the y-axis variables ordered in any way other than the default, which appears to be reverse alphabetical. For example, I'd like Adult to appear near the top of the plot, with Child below it.

I've tried re-leveling the factors before applying gather_set_data, as well as re-leveling the y variable after applying gather_set_data, and neither appears to work. I've also tried converting them to characters and sorting in different orders, but that doesn't seem to work either.

Any help would be appreciated.

Daniel Anderson
  • Any chance you found a solution? – ssp3nc3r Mar 15 '19 at 03:43
  • No, unfortunately I didn't. But this was a while ago and I know development has resumed so maybe this will be fixed soon (last commit I see was 23 hours ago as of this writing). – Daniel Anderson Mar 15 '19 at 18:38
  • Related [GitHub issue by @ssp3nc3r](https://github.com/thomasp85/ggforce/issues/136) – zx8754 Mar 16 '19 at 20:44
  • In my limited use case with only two labels on the *x* that have the same *y* categories, I managed to get it done using `ggplot(data, aes(x, id = id, split = factor(y, levels = c('A', 'B')) ...`. @ssp3nc3r – Paul Lemmens Feb 03 '20 at 19:38

2 Answers

I'm not sure how to do this with ggforce, since I don't use that package. I assumed re-leveling the factors would be the solution, as you mentioned, but that doesn't seem to be working for you. It does work with ggalluvial, however. There is also a `reverse` argument that flips the order of the strata (alphabetical vs. reverse alphabetical). See below:

Default ordering

library(ggplot2)
library(ggalluvial)

df <- as.data.frame(Titanic)

ggplot(as.data.frame(df),
       aes(weight = Freq,
           axis1 = Survived, axis2 = Sex, axis3 = Class)) +
  geom_alluvium(aes(fill = Class),
                width = 0, knot.pos = 1/4, reverse = FALSE) +
  guides(fill = FALSE) +
  geom_stratum(width = 1/8, reverse = FALSE) +
  geom_text(stat = "stratum", label.strata = TRUE, reverse = FALSE) +
  scale_x_continuous(breaks = 1:3, labels = c("Survived", "Sex", "Class")) +
  ggtitle("Titanic survival by class and sex")

Reverse ordering

ggplot(as.data.frame(df),
       aes(weight = Freq,
           axis1 = Survived, axis2 = Sex, axis3 = Class)) +
  geom_alluvium(aes(fill = Class),
                width = 0, knot.pos = 1/4, reverse = TRUE) +
  guides(fill = FALSE) +
  geom_stratum(width = 1/8, reverse = TRUE) +
  geom_text(stat = "stratum", label.strata = TRUE, reverse = TRUE) +
  scale_x_continuous(breaks = 1:3, labels = c("Survived", "Sex", "Class")) +
  ggtitle("Titanic survival by class and sex")

Re-leveling factor

df$Class <- factor(df$Class, levels = c("3rd", "1st", "Crew", "2nd"))

ggplot(as.data.frame(df),
       aes(weight = Freq,
           axis1 = Survived, axis2 = Sex, axis3 = Class)) +
  geom_alluvium(aes(fill = Class),
                width = 0, knot.pos = 1/4, reverse = FALSE) +
  guides(fill = FALSE) +
  geom_stratum(width = 1/8, reverse = FALSE) +
  geom_text(stat = "stratum", label.strata = TRUE, reverse = FALSE) +
  scale_x_continuous(breaks = 1:3, labels = c("Survived", "Sex", "Class")) +
  ggtitle("Titanic survival by class and sex")
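
As a quick sanity check before plotting, this base-R sketch shows what the re-leveling step does: only the order of the levels changes, not the values themselves. The name `releveled` is purely illustrative, and the check starts from a fresh copy of the data so it doesn't depend on the code above.

```r
# Start from a fresh copy of the Titanic data.
df <- as.data.frame(Titanic)

# Re-level Class into the custom order used above (illustrative variable name).
releveled <- factor(df$Class, levels = c("3rd", "1st", "Crew", "2nd"))

levels(df$Class)   # "1st"  "2nd"  "3rd"  "Crew"
levels(releveled)  # "3rd"  "1st"  "Crew" "2nd"

# The underlying values are unchanged; only the level order differs.
identical(as.character(releveled), as.character(df$Class))  # TRUE

# A misspelled level would silently become NA, so it is worth checking.
anyNA(releveled)  # FALSE
```

If `anyNA()` returns `TRUE` on your own data, one of the names passed to `levels` most likely doesn't match a category exactly.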

tyluRp
  • Thanks! That's a nice solution. I'll try it out and see if I can make it work. I was hoping for a ggforce solution since that's what I'm using already, but this may work too. – Daniel Anderson Dec 12 '17 at 00:47
  • Hmm... so occasionally the flow doesn't seem to match directly with the stratum - for example, the pink/orange line from female to first class in the first figure overlaps with 2nd class a bit when connecting to the class stratum. This matters a lot in my actual application because small lines look to be coming from the wrong category. Any way to fix that? – Daniel Anderson Dec 12 '17 at 01:16
  • Hm, try adjusting the `width` in `geom_alluvium` to `width = 0.1`. This seems to reduce the overlap. – tyluRp Dec 12 '17 at 01:22
  • It would be nice if we could add spaces in between the strata, but it seems that `ggalluvial` is not designed to do this. ["No gaps are inserted between the strata, so the total height of the diagram reflects the cumulative weight of the observations."](https://cran.r-project.org/web/packages/ggalluvial/vignettes/ggalluvial.html). Might be a feature at a later time? ["The option should be available to impose spacing between the strata within each axis."](https://libraries.io/github/corybrunson/ggalluvial) I'm heading to work but I will try to figure something out when I get back. – tyluRp Dec 12 '17 at 01:33

How about converting the y variable into a factor with explicit levels, as follows?

    library(ggplot2)
    library(ggforce)
    titanic <- reshape2::melt(Titanic)
    titanic <- gather_set_data(titanic, 1:4)
    titanic$y <- factor(titanic$y, levels=c("Adult", "Child", "1st", "2nd", "3rd", "Crew", "Male", "Female", "Yes", "No"))
    ggplot(titanic, aes(x, id = id, split = y, value = value)) +
        geom_parallel_sets(aes(fill = Sex), alpha = 0.3, axis.width = 0.1) +
        geom_parallel_sets_axes(axis.width = 0.1) +
        geom_parallel_sets_labels(colour = 'white')
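
If typing all ten levels by hand feels error-prone, the level vector could also be assembled per axis, as in the sketch below. The names `wanted` and `y_levels` are just illustrative. Whether the first level is drawn at the top or the bottom of an axis may depend on the ggforce version, so treat the direction as an assumption to check.

```r
# Desired order of categories within each axis (illustrative choice).
wanted <- list(
  Age      = c("Adult", "Child"),
  Class    = c("1st", "2nd", "3rd", "Crew"),
  Sex      = c("Male", "Female"),
  Survived = c("Yes", "No")
)

# Flatten into a single vector of levels for the y column.
y_levels <- unlist(wanted, use.names = FALSE)

# Guard against typos: the levels must cover every Titanic category exactly
# once, otherwise factor() would silently turn unmatched values into NA.
all_categories <- unlist(dimnames(Titanic), use.names = FALSE)
stopifnot(setequal(y_levels, all_categories), !anyDuplicated(y_levels))
```

`titanic$y <- factor(titanic$y, levels = y_levels)` would then replace the hand-written call. If the axes come out upside down, `rev(y_levels)` flips the order within every axis.
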
Jozef Asai