
I want to keep the second axis of an alluvial plot in the same order as the first axis. The first axis is a higher taxonomic rank of the second, so its main purpose is to give an overview and to group the organisms, which makes the graph easier to read. To achieve this, I tried to order the strata manually. However, all I managed was to reshuffle the lodes instead of the strata (e.g. by following this tutorial or by playing with lode.guidance).

Does anyone have an idea how to solve this? In the end, all lodes between the first and the second axis should flow horizontally, and only between the second and the third axis should they be re-sorted, as they are now.

A short version of the data (still quite extensive, sorry):

taxa <- structure(list(Order = structure(c(6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
                                           6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
                                           6L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L,
                                           9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L,
                                           9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L,
                                           9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L,
                                           9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L,
                                           9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L,
                                           9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L,
                                           9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L,
                                           9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L,
                                           9L, 9L, 19L, 19L, 19L, 19L, 19L, 19L,
                                           19L, 19L, 19L, 19L, 19L, 19L, 19L,
                                           19L, 27L, 27L, 27L, 27L, 27L, 27L,
                                           27L, 27L, 27L, 27L, 27L, 27L, 27L,
                                           32L, 32L, 32L, 32L, 32L, 32L, 32L,
                                           32L, 32L, 32L, 32L, 32L, 32L),
                                         .Label = c("Adinetida", "Cercomonadidae_or",
                                                    "Cercozoa_unclassified",
                                                    "Chaetopeltidales",
                                                    "Chlorophyta_ph_unclassified",
                                                    "Chromulinales",
                                                    "Chrysophyceae_unclassified",
                                                    "Chytridiomycetes_unclassified",
                                                    "Conthreep", "Craspedida_or",
                                                    "Cryomonadida", "Cryptomonadales",
                                                    "Cystobasidiales",
                                                    "Cystobasidiomycetes_unclassified",
                                                    "Cystofilobasidiales",
                                                    "Dinophyceae_unclassified",
                                                    "Diplogasterida",
                                                    "Glissomonadida_or",
                                                    "Helotiales", "Imbricatea_unclassified",
                                                    "Incertae_Sedis",
                                                    "Intramacronucleata_unclassified",
                                                    "Leotiomycetes_unclassified", "LG08-10_or",
                                                    "Litostomatea", "Monhysterida",
                                                    "Ochromonadales",
                                                    "Ochrophyta_ph_unclassified",
                                                    "Parachela",
                                                    "Peronosporomycetes_or",
                                                    "Phragmoplastophyta_unclassified",
                                                    "Saccharomycetales",
                                                    "Spirotrichea", "Spongomonadida",
                                                    "Thecofilosea_unclassified",
                                                    "Tremellales", "Tremellomycetes_or",
                                                    "Tremellomycetes_unclassified",
                                                    "Trichosporonales"),
                                         class = "factor"),
              Genus = c(rep("Poterioochromonas", 19),
                        rep("Colpoda", 9),
                        rep("Colpodea_unclassified", 24),
                        rep("Colpodida_ge", 28),
                        rep("Conthreep_unclassified", 4),
                        "Cryptocaryon", "Cyclidium",
                        rep("Nassophorea_unclassified", 2),
                        "Platyophrya", rep("Tetrahymena", 5),
                        rep("uncultured", 3),
                        rep("uncultured_ge", 4), "Glarea",
                        rep("Helotiales_unclassified_ge", 13),
                        rep("Chrysolepidomonas", 12), "Ochromonas",
                        rep("Debaryomycetaceae_unclassified", 3),
                        "Pichiaceae_unclassified_ge",
                        rep("Saccharomycetaceae_unclassified", 5),
                        rep("Yarrowia", 4)),
              Freq = rep(1, 141),
              Habitat = c("B", "B", "B", "B", "B", "B", "B","B", "B", "B", "B",
                          "B", "B", "B", "B", "B", "B", "B", "B", "A", "A", "A",
                          "B", "A", "A", "B", "B", "B", "B", "A", "A", "A", "A",
                          "B", "A", "A", "A", "A", "B", "A", "B", "B", "A", "B",
                          "B", "B", "A", "A", "A", "B", "B", "B", "A", "A", "A",
                          "A", "A", "A", "A", "A", "B", "A", "B", "A", "B", "B",
                          "A", "A", "B", "B", "A", "B", "A", "B", "B", "A", "A",
                          "B", "B", "B", "B", "B", "B", "B", "B", "B", "A", "A",
                          "B", "A", "A", "B", "B", "B", "A", "B", "B", "A", "B",
                          "B", "B", "A", "A", "A", "A", "A", "A", "A", "A", "B",
                          "A", "B", "A", "B", "A", "A", "A", "B", "B", "B", "B",
                          "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B",
                          "A", "A", "B", "A", "B", "A", "B", "B", "B")),
              class = "data.frame", row.names = 1:141)

And here is the code for the alluvial diagram, in which the strata of all three axes are sorted alphabetically:

library("ggalluvial")

ggplot(data = taxa,
       aes(axis1 = Order, axis2 = Genus, axis3 = Habitat, y = Freq)) +
  stat_alluvium(aes(fill = Habitat)) +
  geom_stratum(linetype = 1, lwd = 0.01) +
  geom_text(stat = "stratum", infer.label = TRUE, size = 3) +
  theme_void() +
  theme(legend.position = "none")

Alluvial diagram, first two axes sorted alphabetically

bathyscapher
  • Have you tried converting `Genus` and `Order` with `as.factor` and reordering the levels? Also, I don't have the option `infer.label` in `geom_text()`. That's odd, because it is in the [GGAlluvial doc](https://cran.r-project.org/web/packages/ggalluvial/vignettes/ggalluvial.html), but apparently not in the [geom_text() doc](https://ggplot2.tidyverse.org/reference/geom_text.html). – s__ May 11 '20 at 13:38
  • Hmm, sounds like a good idea. Sadly, it doesn't seem to solve it; rather, it behaves quite unexpectedly. After converting both to factors and sorting `Order`, the lodes are now horizontal, but they no longer match the taxonomy, and only one of the labels remains and is shifted... No idea either how `ggalluvial` implements `infer.label` in `geom_text()`. – bathyscapher May 11 '20 at 17:15
  • @s_t your suggestion was super close (see the answer)! – bathyscapher May 21 '20 at 13:00

1 Answer


As @s_t suggested, this can be resolved by making taxa$Genus a factor rather than a character variable. However, as.factor() puts the factor levels in alphabetical order, which is not what you're after. Specifying the levels in order of their appearance in taxa instead, as in the following code, reorders the strata in the second axis while leaving the other axes as they were. This works here because the rows of taxa are already grouped by Order. Is the resulting plot what you're after? (Code copied from your question is omitted.)

# ensure that 'Genus' is a factor with levels in order of appearance
taxa$Genus <- factor(taxa$Genus, levels = as.character(unique(taxa$Genus)))
# plot
ggplot(data = taxa,
       aes(axis1 = Order, axis2 = Genus, axis3 = Habitat, y = Freq)) +
  stat_alluvium(aes(fill = Habitat)) +
  geom_stratum(linetype = 1, lwd = 0.01) +
  geom_text(stat = "stratum", infer.label = TRUE, size = 3) +
  theme_void() +
  theme(legend.position = "none")

Created on 2020-05-16 by the reprex package (v0.3.0)
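
To make the difference between the level orderings concrete, here is a minimal base R sketch. It assumes a small made-up data frame `toy` (not the question's data) whose rows are deliberately not sorted by `Order`. The `by_parent` variant is not part of the code above; it is only one possible way to get the same grouping when the rows are not pre-sorted.

```r
# Hypothetical toy data: each genus belongs to exactly one order,
# but the rows are NOT grouped by order.
toy <- data.frame(
  Order = c("Helotiales", "Chromulinales", "Helotiales", "Conthreep"),
  Genus = c("Glarea", "Poterioochromonas",
            "Helotiales_unclassified_ge", "Colpoda"),
  stringsAsFactors = FALSE
)

# as.factor() sorts the levels alphabetically
alphabetical <- levels(as.factor(toy$Genus))

# levels in order of first appearance (the approach used above)
by_appearance <- levels(factor(toy$Genus, levels = unique(toy$Genus)))

# levels grouped by the parent Order first, then by Genus within each Order;
# this does not rely on the rows being pre-sorted
row_order <- order(toy$Order, toy$Genus)
by_parent <- levels(factor(toy$Genus, levels = unique(toy$Genus[row_order])))

print(alphabetical)
print(by_appearance)
print(by_parent)
```

Applied to the question's data, the same idea would read `factor(taxa$Genus, levels = unique(taxa$Genus[order(taxa$Order, taxa$Genus)]))`. Since `Order` is a factor there, `order()` should follow its level order, which is also the order of the strata on the first axis.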

Cory Brunson