I am using R and ggplot2 to draw boxplots overlaid with the individual data points. There are two major groups on the x axis (var1 in sampleData). The second group also has a minor division (var2), which I would like to indicate by point fill. I have done this successfully with the new_scale_fill function from the ggnewscale package.

However, doing so seems to alter the jitter: the points of each minor division (var2) cluster together on the left and right of the boxplot. Is there a way to avoid this and have all of the points scatter randomly within each var1 group?

Example data and graph output are below.

library(ggplot2)
library(ggnewscale)  # provides new_scale_fill()

sampleData <- data.frame(numVal = rnorm(100),
                         var1 = rep(c("x1", "x2"), each = 50),
                         var2 = rep(c("x1", "x2", "x3"), times = c(50, 30, 20)),
                         var3 = rep(c("t1", "t2", "t3", "t4"), 25))

ggplot(sampleData, aes(x = var1, y = numVal)) +
  geom_boxplot(aes(fill = var1), alpha = 0.7) +
  scale_fill_manual(values = c("red", "blue")) +
  new_scale_fill() +  # start a second, independent fill scale for the points
  geom_point(aes(fill = var2), shape = 21,
             position = position_jitterdodge()) +
  scale_fill_manual(values = c("red", "blue", "white")) +
  theme(legend.position = "none")


Example output

mikeHoncho

1 Answer

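Try the dodge.width argument of position_jitterdodge(). The points are dodged by their fill groups (var2), which is what pushes the blue and white points to opposite sides of the boxplot. Setting dodge.width to zero removes that offset, so the blue and white points mix.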

ggplot(sampleData, aes(x = var1, y = numVal)) +
  geom_boxplot(aes(fill = var1), alpha = 0.7) +
  scale_fill_manual(values = c("red", "blue")) +
  new_scale_fill() +
  geom_point(aes(fill = var2), shape = 21,
             position = position_jitterdodge(dodge.width = 0)) +
  scale_fill_manual(values = c("red", "blue", "white")) 
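
Since dodge.width = 0 turns the dodging off altogether, plain position_jitter() should give an equivalent picture, and its seed argument makes the jitter repeatable between renders. A minimal sketch, assuming the question's sampleData layout: the width of 0.2 and both seed values are arbitrary choices of mine, and height = 0 is set because position_jitter() also jitters vertically by default, which position_jitterdodge() does not.

```r
library(ggplot2)
library(ggnewscale)

set.seed(1)  # arbitrary seed so the simulated data are repeatable
sampleData <- data.frame(numVal = rnorm(100),
                         var1 = rep(c("x1", "x2"), each = 50),
                         var2 = rep(c("x1", "x2", "x3"), times = c(50, 30, 20)))

p_jitter <- ggplot(sampleData, aes(x = var1, y = numVal)) +
  geom_boxplot(aes(fill = var1), alpha = 0.7) +
  scale_fill_manual(values = c("red", "blue")) +
  new_scale_fill() +
  # No dodging at all: every point is jittered around its var1 position.
  # width = 0.2 and seed = 42 are illustrative values, not recommendations.
  geom_point(aes(fill = var2), shape = 21,
             position = position_jitter(width = 0.2, height = 0, seed = 42)) +
  scale_fill_manual(values = c("red", "blue", "white"))

p_jitter
```

With a fixed seed the points land in the same place every time the plot is drawn, which helps when the figure has to be regenerated.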


Edit: Third variable

If you put a third variable on the x axis, you may need to control the interaction between the x and fill aesthetics. Here is an idea:

ggplot(sampleData, aes(x = interaction(var1,var3), y = numVal)) +
  geom_boxplot(aes(fill = var1), alpha = 0.7) +
  scale_fill_manual(values = c("red", "blue")) +
  new_scale_fill() +
  geom_point(aes(fill = interaction(var1,var2)), shape = 21,
             position = position_jitterdodge(dodge.width = 0)) +
  scale_fill_manual(values = c("red", "blue", "white"),
                    labels = c("x1", "x2", "x3"),
                    name = "var2") +
  labs(x = "var3") +
  facet_grid(. ~ var3, scales = "free_x", switch = "x") +
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        panel.spacing.x = unit(0, "lines"),
        strip.background = element_rect(fill = "transparent"))
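
Another option for the multi-group case, offered as a sketch rather than a definitive recipe: put var3 directly on the x axis and give the points an explicit group = var1. As far as I understand it, position_jitterdodge() dodges on the group aesthetic, so the points should line up with the var1 boxplots while fill = var2 only controls their colour. The jitter.width and seed values are arbitrary assumptions of mine.

```r
library(ggplot2)
library(ggnewscale)

set.seed(1)  # arbitrary seed so the simulated data are repeatable
sampleData <- data.frame(numVal = rnorm(100),
                         var1 = rep(c("x1", "x2"), each = 50),
                         var2 = rep(c("x1", "x2", "x3"), times = c(50, 30, 20)),
                         var3 = rep(c("t1", "t2", "t3", "t4"), 25))

p_grouped <- ggplot(sampleData, aes(x = var3, y = numVal)) +
  geom_boxplot(aes(fill = var1), alpha = 0.7) +
  scale_fill_manual(values = c("red", "blue")) +
  new_scale_fill() +
  # group = var1 decides which boxplot a point is dodged onto;
  # fill = var2 then only colours the points within that boxplot.
  geom_point(aes(fill = var2, group = var1), shape = 21,
             position = position_jitterdodge(jitter.width = 0.2,
                                             dodge.width = 0.75,
                                             seed = 42)) +
  scale_fill_manual(values = c("red", "blue", "white"))

p_grouped
```

Here dodge.width = 0.75 is simply the function's default, which is meant to match dodged boxplots. If the points spread too far or too little, jitter.width is the value to adjust.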


tamtam
  • This is a good answer, thank you. Although it works in this instance, it does not work in a more complicated instance, for example if you replace the x axis variable with the new var3 so you have multiple groupings of the two variables across the x axis. – mikeHoncho Dec 15 '20 at 10:02