I am trying to draw boxplots of data grouped by one variable using ggplot2, and I want to distinguish the individual data points by replicate using different symbols.
My data:
Cell_line <- rep(c("B1", "B2", "C"), each = 4)
Condition <- rep(c("1", "2", "3", "4"), times = 3)
rep1 <- c(250.1202269, NA, 87.78025978, 103.7252853, 131.3835253, NA, 168.8831935, 135.5137408, 137.9377942, NA, 48.73955206, 73.48705161)
rep2 <- c(176.5811282, 165.4414077, 58.18896416, 52.48947013, 214.1871341, 200.8850097, 312.473565, 194.1484832, 221.5290924, 208.2391158, 107.2347819, 81.38548616)
rep3 <- c(125.0917574, 71.3834596, 40.42846894, 22.41081706, 128.4170654, 114.8438056, 150.7904802, 112.1023294, 99.56769695, 135.9090866, 93.05268714, 39.17564189)
df <- data.frame(Cell_line, Condition, rep1, rep2, rep3)
I can plot it fine without the different symbols using geom_beeswarm to add the points:
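library(tidyverse)   # ggplot2, tidyr, dplyr, forcats
library(ggbeeswarm)  # geom_beeswarm()

df %>%
  # reshape to one row per measurement, with the replicate name in its own column
  pivot_longer(cols = rep1:rep3, names_to = "replicate", values_to = "expression") %>%
  mutate(Condition = fct_relevel(Condition, "1", "2", "3", "4")) %>%
  ggplot(aes(x = Condition, y = expression, colour = Cell_line)) +
  geom_boxplot() +
  geom_beeswarm(dodge.width = 0.75, size = 2.5)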
(https://i.stack.imgur.com/SvnWv.png)
The problem starts when I switch to geom_point to change the symbols: the points are scattered instead of lining up along the centre of their respective boxplots.
df %>%
  pivot_longer(cols = rep1:rep3, names_to = "replicate", values_to = "expression") %>%
  mutate(Condition = fct_relevel(Condition, "1", "2", "3", "4")) %>%
  ggplot(aes(x = Condition, y = expression, colour = Cell_line)) +
  geom_boxplot() +
  geom_point(aes(colour = Cell_line, shape = replicate), position = position_dodge(width = 1), size = 3) +
  scale_shape_manual(values = c(15, 16, 17))
(https://i.stack.imgur.com/QAkrS.png)
How can I fix this so it appears like the first plot except with different symbols?
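One likely cause, offered as a sketch rather than a definitive answer: mapping shape = replicate inside geom_point() makes ggplot2 group the points by Cell_line and replicate together, so position_dodge() spreads them over nine positions per condition rather than three. Setting group = Cell_line in the point layer and using a dodge width of 0.75 (the value given to geom_beeswarm in the first plot, where the points lined up) should bring the points back onto the box centres. The name long_df and the use of base factor() in place of fct_relevel() are choices made for this example only.

```r
library(ggplot2)
library(tidyr)
library(dplyr)

# Data copied from the question
Cell_line <- rep(c("B1", "B2", "C"), each = 4)
Condition <- rep(c("1", "2", "3", "4"), times = 3)
rep1 <- c(250.1202269, NA, 87.78025978, 103.7252853, 131.3835253, NA,
          168.8831935, 135.5137408, 137.9377942, NA, 48.73955206, 73.48705161)
rep2 <- c(176.5811282, 165.4414077, 58.18896416, 52.48947013, 214.1871341,
          200.8850097, 312.473565, 194.1484832, 221.5290924, 208.2391158,
          107.2347819, 81.38548616)
rep3 <- c(125.0917574, 71.3834596, 40.42846894, 22.41081706, 128.4170654,
          114.8438056, 150.7904802, 112.1023294, 99.56769695, 135.9090866,
          93.05268714, 39.17564189)
df <- data.frame(Cell_line, Condition, rep1, rep2, rep3)

# long_df is a name introduced for this sketch
long_df <- df %>%
  pivot_longer(cols = rep1:rep3, names_to = "replicate", values_to = "expression") %>%
  mutate(Condition = factor(Condition, levels = c("1", "2", "3", "4")))

p <- ggplot(long_df, aes(x = Condition, y = expression, colour = Cell_line)) +
  geom_boxplot(na.rm = TRUE) +
  geom_point(
    # group = Cell_line overrides the default Cell_line x replicate grouping,
    # so the points are dodged into the same three slots as the boxes
    aes(shape = replicate, group = Cell_line),
    position = position_dodge(width = 0.75),
    size = 3,
    na.rm = TRUE  # rep1 contains NA values
  ) +
  scale_shape_manual(values = c(15, 16, 17))

print(p)
```

Points from different replicates then share one vertical line per box. If they overlap too much, position_jitterdodge() is an alternative to try, at the cost of a small horizontal spread.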