I'm trying to improve my data visualization skills, and I've almost got the plot I wanted, but I'm now stuck and can't get any further. I searched this site extensively before asking, and it helped me a lot.

Here is my data set:

https://app.box.com/s/pp5p5chgypn6ba33anotie7wlxvdu01v

Here is my code:

library(tidyverse)
library(ggalluvial)
library(alluvial)

A_col <- "firebrick3"
B_col <- "darkorange"
C_col <- "aquamarine2"
D_col <- "dodgerblue2"
E_col <- "darkviolet"
F_col <- "chartreuse2"
G_col <- "goldenrod1"
H_col <- "gray73"
set.seed(39)

ggplot(df,
       aes(y = Time, axis1 = Activity, axis2 = Category, axis3 = Positions)) +
  geom_alluvium(aes(fill = Positions, color = Positions), 
        width = 4/12, alpha = 0.5, knot.pos = 0.3) +
  geom_stratum(width = 4/12, color = "grey36") +
  geom_text(stat = "stratum", label.strata = TRUE) +
  scale_x_continuous(breaks = 1:3, 
       labels = c("Activity", "Category", "Positions/Movements"), expand = c(.01, .05)) +
  ylab("Time 24 hours") +
  scale_fill_manual(values  = c(A_col, B_col, C_col, D_col, E_col, F_col, G_col, H_col)) +
  scale_color_manual(values = c(A_col, B_col, C_col, D_col, E_col, F_col, G_col, H_col)) +
  ggtitle("Physical Activity during the week and weekend") +
  theme_minimal() +
  theme(legend.position = "none", panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(), axis.text.y = element_blank(), 
        axis.text.x = element_text(size = 12, face = "bold"))

# I also have this code that I run without pre-choosing the colours.
# I like this one because the flow diagram doesn't have any border.

ggplot(df,
       aes(y = Time, axis1 = Activity, axis2 = Category, axis3 = Positions)) +
  scale_x_discrete(limits = c("Activity", "Category", "Positions/Movements"), 
       expand = c(.01, .05)) +
  ylab("Time 24 hours") +
  geom_alluvium(aes(fill = Positions), width = 4/12, alpha = 0.5, knot.pos = 0.3) +
  geom_stratum() + geom_text(stat = "stratum", label.strata = TRUE) +
  theme_minimal() +
  ggtitle("Physical Activity during the week and weekend") +
  theme(legend.position = "none", panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(), axis.text.y = element_blank(), 
        axis.text.x = element_text(size = 12, face = "bold"))

Here is the visualization: [image]

There are three things I really couldn't do:

  1. Sort the Category column so that the week comes first and the weekend after it: Working, Non Working, Sleep Week, Leisure and Sleep Weekend.

  2. Sort the Positions/Movements column in this order: Sitting, Lying, Standing, Moving, Stairs, Walk Slow, Walk Fast and Running. I would also like to fill the boxes of this column with the same colours as the flows. Some names don't have enough space: is it possible to resize the boxes to fit them, or to place the names outside with an arrow pointing to the box they belong to? Is there a way to assign a colour manually to each value, for example black for Walk Slow? Finally, if possible, I would like to remove the border lines from the edges of the flows.

  3. Is there a way to stack the names Positions and Movements in the axis label?

Any way to improve this visualization and make it beautiful?

Thanks in advance, Luiz

Luiz Brusaca

2 Answers

Here's a solution that fixes some of your problems.

df <- read_csv('Desktop/plot_alluvial_category_position_plus_moviments.csv')
positions <- c("Sitting", "Lying", "Standing", "Moving", "Stairs", "Walk Slow",
               "Walk Fast", "Running")
df$Positions <- factor(df$Positions, levels = positions, labels = positions)
category <- c("Working", "Non Working", "Sleep Week", "Leisure", 
              "Sleep Weekend")
df$Category <- factor(df$Category, levels = category, labels = category)

ggplot(df,
       aes(y = Time, axis1 = Activity, axis2 = Category, axis3 = Positions)) +
  geom_alluvium(aes(fill = Positions), 
                width = 4/12, alpha = 0.5, knot.pos = 0.3) +
  geom_stratum(width = 4/12, color = "grey36") +
  geom_text(stat = "stratum", label.strata = TRUE, min.height=100) +
  scale_x_continuous(breaks = 1:3, 
                     labels = c("Activity", "Category", "Positions\nMovements"), expand = c(.01, .05)) +
  ylab("Time 24 hours") +
  scale_fill_manual(values  = c(A_col, B_col, C_col, D_col, E_col, F_col, G_col, H_col)) +
  scale_color_manual(values = c(A_col, B_col, C_col, D_col, E_col, F_col, G_col, H_col)) +
  ggtitle("Physical activity during the week and weekend") +
  theme_minimal() +
  theme(legend.position = "none", panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(), axis.text.y = element_blank(), 
        axis.text.x = element_text(size = 12, face = "bold"))
  1. To sort your strata, convert the Category and Positions columns to factors and set the order of the levels explicitly.
  2. To remove the edges of the flows, it's enough to remove color = Positions from the aes() call in geom_alluvium().
  3. You can stack the names Positions and Movements by adding a newline character (\n) to the label.
  4. You can assign the colors to strata, but only if the categories are the same throughout (check some examples in the ggalluvial documentation).
  5. To avoid overlapping labels in the small strata, you can use the min.height argument of geom_text(), which was introduced in ggalluvial version 0.9.2, as in the geom_text() call above.
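On the question of giving a specific value its own colour (for example black for Walk Slow), one option is to pass a named vector to scale_fill_manual(), so colours are matched by level name instead of by position. The sketch below is only an illustration: the `toy` data frame and its values are made up to stand in for the CSV, and it uses `aes(label = after_stat(stratum))` for the labels, which, as far as I know, is what newer ggalluvial releases expect in place of `label.strata = TRUE`.

```r
library(ggplot2)
library(ggalluvial)

# Hypothetical toy data standing in for the asker's CSV (values are made up).
toy <- data.frame(
  Activity  = c("Week", "Week", "Weekend", "Weekend"),
  Category  = c("Working", "Sleep Week", "Leisure", "Sleep Weekend"),
  Positions = c("Sitting", "Lying", "Walk Slow", "Lying"),
  Time      = c(8, 7, 3, 9)
)

# Set the level order explicitly; this controls how the strata are ordered.
toy$Positions <- factor(toy$Positions,
                        levels = c("Sitting", "Lying", "Walk Slow"))

# A named vector ties each colour to a level by name, not by position.
position_cols <- c("Sitting"   = "firebrick3",
                   "Lying"     = "dodgerblue2",
                   "Walk Slow" = "black")

p <- ggplot(toy,
            aes(y = Time, axis1 = Activity, axis2 = Category, axis3 = Positions)) +
  # No colour aesthetic here, so the flows are drawn without borders.
  geom_alluvium(aes(fill = Positions), width = 4/12, alpha = 0.5) +
  geom_stratum(width = 4/12, color = "grey36") +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  scale_x_continuous(breaks = 1:3,
                     labels = c("Activity", "Category", "Positions\nMovements")) +
  scale_fill_manual(values = position_cols) +
  theme_minimal() +
  theme(legend.position = "none")

# print(p) draws the plot.
```

Because the colours are looked up by name, reordering the factor levels changes the stacking order without changing which colour each position gets.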
Arienrhod
  • Hi @Arienrhod, thank you for your help. I installed the new `ggalluvial` package and `min.height` didn't work, I don't know why. When I run the code, the names disappear, so I deleted `min.height` and I got them back. I will try to find out what happened. – Luiz Brusaca Sep 02 '19 at 20:55
  • That's exactly what `min.height` does. It removes the labels that are too small to fit in the stratum. I'm not aware of another solution for the overlapping labels. – Arienrhod Sep 03 '19 at 09:05
  • Yes, you're right, but somehow it removed all my labels. I was reading the `ggalluvial` documentation and the best option for me is to use `ggrepel`. I will figure out how to use it in my code. – Luiz Brusaca Sep 03 '19 at 17:04
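For the `ggrepel` route mentioned in the last comment, a minimal sketch might look like the following. It assumes the `ggrepel` package is installed and that geom_text_repel() is combined with stat = "stratum"; the `toy` data frame is hypothetical. Labels on the first two axes stay inside the boxes, while labels on the third axis are nudged to the right and spread vertically so that thin strata no longer overlap.

```r
library(ggplot2)
library(ggalluvial)
library(ggrepel)

# Hypothetical toy data with two very thin strata (values are made up).
toy <- data.frame(
  Activity  = c("Week", "Week", "Week", "Weekend"),
  Category  = c("Working", "Working", "Non Working", "Leisure"),
  Positions = c("Sitting", "Stairs", "Running", "Lying"),
  Time      = c(10, 0.2, 0.1, 9)
)

p <- ggplot(toy,
            aes(y = Time, axis1 = Activity, axis2 = Category, axis3 = Positions)) +
  geom_alluvium(aes(fill = Positions), width = 4/12, alpha = 0.5) +
  geom_stratum(width = 4/12, color = "grey36") +
  # Axes 1 and 2: plain labels inside the boxes (NA labels are dropped when drawn).
  geom_text(stat = "stratum",
            aes(label = after_stat(ifelse(x < 3, as.character(stratum), NA)))) +
  # Axis 3: labels moved to the right of the boxes, repelled along y only.
  geom_text_repel(stat = "stratum",
                  aes(label = after_stat(ifelse(x == 3, as.character(stratum), NA))),
                  direction = "y", nudge_x = 0.4, segment.color = "grey50") +
  # Extra room on the right so the outside labels are not clipped.
  scale_x_continuous(breaks = 1:3,
                     labels = c("Activity", "Category", "Positions\nMovements"),
                     expand = expansion(mult = c(0.05, 0.3))) +
  theme_minimal() +
  theme(legend.position = "none")

# print(p) draws the plot; ggplot2 warns about the rows with NA labels.
```

The connecting segments drawn by geom_text_repel() play the role of the "arrow" asked for in the question; the nudge and expansion values are guesses that will likely need tuning for the real data.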

Very helpful, thank you for posting! I figured out a workaround for #4 in @Arienrhod's answer (sorry, I cannot just comment due to low reputation). You can create a factor of the same length as the data, assign the individual categories to it in the proper order, map it to fill inside geom_stratum(aes(fill = your.factor), width = 4/12, color = "grey36"), and then use scale_fill_manual() as shown above. It's a chore, but it works.
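A related way to colour the boxes, sketched here as an alternative and not as the exact method described above, is to map the fill of geom_stratum() to the computed `stratum` variable and give every stratum a colour by name. The `toy` data frame and the colour choices are hypothetical; strata on the first two axes are set to a neutral grey so that only the Positions column picks up the flow colours.

```r
library(ggplot2)
library(ggalluvial)

# Hypothetical toy data standing in for the asker's CSV (values are made up).
toy <- data.frame(
  Activity  = c("Week", "Week", "Weekend"),
  Category  = c("Working", "Sleep Week", "Leisure"),
  Positions = c("Sitting", "Lying", "Walk Slow"),
  Time      = c(8, 7, 3)
)

# One named colour per stratum across all three axes.
cols <- c(
  "Sitting" = "firebrick3", "Lying" = "dodgerblue2", "Walk Slow" = "goldenrod1",
  "Week" = "grey90", "Weekend" = "grey90",
  "Working" = "grey90", "Sleep Week" = "grey90", "Leisure" = "grey90"
)

p <- ggplot(toy,
            aes(y = Time, axis1 = Activity, axis2 = Category, axis3 = Positions)) +
  geom_alluvium(aes(fill = Positions), width = 4/12, alpha = 0.5) +
  # Each box is filled according to its own stratum name.
  geom_stratum(aes(fill = after_stat(stratum)), width = 4/12, color = "grey36") +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  scale_x_continuous(breaks = 1:3,
                     labels = c("Activity", "Category", "Positions\nMovements")) +
  scale_fill_manual(values = cols) +
  theme_minimal() +
  theme(legend.position = "none")

# print(p) draws the plot.
```

Listing every stratum in `cols` is tedious, but it keeps the flows and the boxes on one shared fill scale, so a position always has the same colour in both.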

  • This does not provide an answer to the question. Once you have sufficient [reputation](https://stackoverflow.com/help/whats-reputation) you will be able to [comment on any post](https://stackoverflow.com/help/privileges/comment); instead, [provide answers that don't require clarification from the asker](https://meta.stackexchange.com/questions/214173/why-do-i-need-50-reputation-to-comment-what-can-i-do-instead). - [From Review](/review/late-answers/34723295) – Joe Jul 26 '23 at 19:37