21

Here is my code at the moment:

require(ggplot2)
value <- rnorm(40, mean = 10, sd = 1)
variable <- c(rep('A', 20), rep('B', 20))
group <- rep(c('Control', 'Disease'), 20)
data <- data.frame(value, variable, group)

ggplot(data, aes(x=variable, y=value)) +
  geom_boxplot(aes(fill=group)) +
  geom_point(aes())

This groups the boxplots by variable the way I'd like. However, the points for all groups are overlaid on top of each other, and I'd like them to be split by group so that they line up with their boxplots. How would I go about doing this?

sudo make install

4 Answers

25

Use position_dodge() for the points and also add group=group inside aes() of geom_point().

ggplot(data, aes(x=variable, y=value)) +
  geom_boxplot(aes(fill=group)) +
  geom_point(position=position_dodge(width=0.75),aes(group=group))

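To check that the dodged points really sit on the box centres, one option is to inspect the positions ggplot2 computes. The sketch below rebuilds the question's data (the `set.seed(1)` call and the names `p`, `pts` and `point_x` are my additions, not part of the original answer) and uses `layer_data()` to pull out the x positions of the point layer. The width of 0.75 is assumed here to match the default box width that `geom_boxplot()` uses, which is why the points and boxes line up.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
value <- rnorm(40, mean = 10, sd = 1)
variable <- c(rep('A', 20), rep('B', 20))
group <- rep(c('Control', 'Disease'), 20)
data <- data.frame(value, variable, group)

p <- ggplot(data, aes(x = variable, y = value)) +
  geom_boxplot(aes(fill = group)) +
  geom_point(aes(group = group), position = position_dodge(width = 0.75))

# layer_data() returns the data a layer actually draws; layer 2 is geom_point
pts <- layer_data(p, 2)

# Distinct x positions of the points: one per variable/group combination
point_x <- sort(unique(as.numeric(pts$x)))
print(point_x)
```

With two groups per variable, there should be four distinct x positions, two on either side of each variable's tick mark.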

Didzis Elferts
  • This doesn't work as expected if one of the groups has no points; for that group, the points will be misaligned. – Ryan Hope May 11 '17 at 18:35
  • @RyanHope This was an answer to the particular question. It is not possible to cover all possible situations in one answer. – Didzis Elferts May 11 '17 at 18:48
  • It works with the data he provides but is not a perfect solution to the general problem he proposed. I am just pointing this out because I hit this issue recently. – Ryan Hope May 12 '17 at 16:40
  • @RyanHope Can you provide an example of such data? If I remove all data for A Disease, the plot is made as expected. – Didzis Elferts May 12 '17 at 17:01
23

I don't know when this was introduced, but there is a new(ish) feature called position_jitterdodge, which simplifies this whether you want jittering or not. Usage:

ggplot(data, aes(x=variable, y=value, fill=group)) +
  geom_boxplot() +
  geom_point(position=position_jitterdodge())
  # or, if you don't need jittering
  # geom_point(position=position_jitterdodge(jitter.width = 0, jitter.height = 0)) 

jittered overlay

http://ggplot2.tidyverse.org/reference/position_jitterdodge.html
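As a slightly fuller sketch of the same idea, the example below sets the `position_jitterdodge()` arguments explicitly. The specific values (`jitter.width = 0.2`, `seed = 42`), the `shape = 21` points and the `outlier.shape = NA` setting are my own choices for illustration, not part of the answer above. Fixing `seed` should make the jitter reproducible between renders, and `jitter.height = 0` keeps the y values exact.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the example data are reproducible
value <- rnorm(40, mean = 10, sd = 1)
variable <- c(rep('A', 20), rep('B', 20))
group <- rep(c('Control', 'Disease'), 20)
data <- data.frame(value, variable, group)

p <- ggplot(data, aes(x = variable, y = value, fill = group)) +
  # hide the boxplot's own outlier markers so outliers are not drawn twice
  geom_boxplot(outlier.shape = NA) +
  geom_point(
    shape = 21,  # a fillable shape, so the fill colour shows on the points
    position = position_jitterdodge(
      jitter.width = 0.2,   # horizontal jitter only
      jitter.height = 0,    # leave y values untouched
      dodge.width = 0.75,   # same dodge width as the boxes
      seed = 42             # reproducible jitter
    )
  )

# x positions of the jittered points, as computed by ggplot2
jittered_x <- as.numeric(layer_data(p, 2)$x)
```

Note that `position_jitterdodge()` needs a grouping aesthetic such as `fill` or `colour` to dodge by, which is why `fill = group` sits in the top-level `aes()`.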

NWaters
2

You can try the ggbeeswarm package as well. Here I compare the output of geom_beeswarm and geom_quasirandom:

library(ggbeeswarm)
library(ggplot2)

ggplot(data, aes(x=variable, y=value, fill=group)) +
  geom_boxplot() +
  geom_beeswarm(dodge.width=0.75) +
  geom_quasirandom(dodge.width=.75, col=2) 


Roman
-1

Here is an attempt to apply Didzis's suggestion to a dataset where not all groups have an outlier, so the points don't line up with the correct box.

Data file: https://cmu.box.com/shared/static/2hxp2oms5et1ktr9hukdr539b6svq1cg.rds

require(data.table)
require(ggplot2)

d <- readRDS("2hxp2oms5et1ktr9hukdr539b6svq1cg.rds")

ggplot(d) + 
  geom_boxplot(aes(x=factor(game),y=N,fill=.group),outlier.shape=NA) + 
  geom_point(data=dd.sum1[,.SD[which(N %in% boxplot.stats(N)$out)],by=game][,.group:=factor(.group,levels=.groups)][,],aes(x=factor(game),y=N,color=.group,group=.group),
             position=position_dodge(width=0.75),
             size=.5) +
  facet_grid (type~., scales="free_x", space="free_x") +
  xlab("Game number") +
  ylab("Count") +
  scale_color_brewer("Group",palette="Set1",drop=FALSE) +
  scale_fill_brewer("Group",palette="Set1",drop=FALSE) +
  theme(axis.text.x=element_text(size=7),legend.position="top")

Ryan Hope