
NOTE: I have updated this post following discussion with Z. Lin. Originally, I had simplified my problem to a two-factor design (see section "Original question"). However, my actual data consist of four factors, which requires facet_grid. I am therefore providing an example for a four-factor design further below (see section "Edit").

Original question

Let's assume I have a two factor design with dv as my dependent variable and iv.x and iv.y as my factors/independent variables. Some quick sample data:

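library(ggplot2)  # ggplot(), geom_violin()
library(ggforce)  # geom_sina()
set.seed(1)       # make the random sample reproducible
DF <- data.frame(dv = rnorm(900), 
                 iv.x = sort(rep(letters[1:3], 300)), 
                 iv.y = rep(sort(rep(rev(letters)[1:3], 100)), 3))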
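
The nested sort(rep(...)) calls above are a compact way to build a fully crossed design. As a minimal sketch in base R, the same columns can be written with the each and times arguments of rep(), which makes the block structure explicit. The names iv_x_alt, iv_y_alt, same_x, same_y and cell_counts are introduced here for illustration only and are not part of the original post.

```r
# Rebuild the two factor columns with explicit block sizes (illustrative names).
iv_x_alt <- rep(letters[1:3], each = 300)                      # a, b, c in blocks of 300
iv_y_alt <- rep(rep(c("x", "y", "z"), each = 100), times = 3)  # x, y, z in blocks of 100, repeated per iv.x level

# The original construction from the question, for comparison.
iv_x <- sort(rep(letters[1:3], 300))
iv_y <- rep(sort(rep(rev(letters)[1:3], 100)), 3)

# Check that both constructions give the same vectors.
same_x <- identical(iv_x, iv_x_alt)
same_y <- identical(iv_y, iv_y_alt)

# Cross-tabulate to count observations per condition.
cell_counts <- table(iv.x = iv_x, iv.y = iv_y)
print(cell_counts)
```

Each of the nine iv.x by iv.y conditions holds 100 observations, so the plot should show nine separate distributions, three per level of iv.x.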

My goal is to display each condition separately as can nicely be done with violin plots:

ggplot(DF, aes(iv.x, dv, colour=iv.y)) + geom_violin()  

I have recently come across Sina plots and would like to do the same here. Unfortunately, geom_sina() does not separate the levels of iv.y; it collapses them onto a single position per level of iv.x instead.

ggplot(DF, aes(iv.x, dv, colour=iv.y)) + geom_sina()

An explicit call to position dodge doesn't help either, as this produces an error message:

ggplot(DF, aes(iv.x, dv, colour=iv.y)) + geom_sina(position = position_dodge(width = 0.5))

The authors of Sina plots have already been made aware of this issue in 2016: https://github.com/thomasp85/ggforce/issues/47

My problem is mainly one of time. We want to submit a manuscript soon, and Sina plots would be a great way to display our data. Can anyone think of a workaround for Sina plots such that I can still display two factors, as in the violin plot example above?

Edit

Sample data for a four factor design:

    DF <- data.frame(dv=rnorm(400), 
             iv.w=sort(rep(letters[1:2],200)),
             iv.x=rep(sort(rep(letters[3:4],100)), 2),
             iv.y=rep(sort(rep(rev(letters)[1:2],50)),4),
             iv.z=rep(sort(rep(letters[5:6],25)),8))

An example with violin plots of what I would like to create using Sina plots:

    # mean_cl_normal() requires the Hmisc package to be installed.
    ggplot(DF, aes(iv.x, dv, colour=iv.y)) + 
      facet_grid(iv.w ~ iv.z) +
      geom_violin(aes(y = dv, fill = iv.y), 
          position = position_dodge(width = 1))+
      # `fun` replaces the deprecated `fun.y` (ggplot2 >= 3.3.0)
      stat_summary(aes(y = dv, fill = iv.y), fun=mean, geom="point", 
          colour="black", show.legend = FALSE, size=.2, 
          position=position_dodge(width=1))+
      stat_summary(aes(y = dv, fill = iv.y), fun.data=mean_cl_normal, geom="errorbar", 
          position=position_dodge(width=1), width=.2, show.legend = FALSE,
          colour="black", size=.2) 
Tiberius

1 Answer


Edited solution, since OP clarified that facets are required:

# Map the combination of colour variable and x variable to the x-axis.
ggplot(DF, aes(x = interaction(iv.y, iv.x), 
               y = dv, fill = iv.y, colour = iv.y)) + 
  facet_grid(iv.w ~ iv.z) +      
  geom_sina() +
  # `fun` replaces the deprecated `fun.y` (ggplot2 >= 3.3.0)
  stat_summary(fun=mean, geom="point", 
               colour="black", show.legend = FALSE, size=.2, 
               position=position_dodge(width=1))+
  # mean_cl_normal() requires the Hmisc package to be installed
  stat_summary(fun.data=mean_cl_normal, geom="errorbar", 
               position=position_dodge(width=1), width=.2, 
               show.legend = FALSE,
               colour="black", size=.2) +
  # Label only the first position of each iv.x pair
  scale_x_discrete(name = "iv.x", 
                   labels = c("c", "", "d", "")) +
  theme(panel.grid.major.x = element_blank(),
        axis.text.x = element_text(hjust = -4),
        axis.ticks.x = element_blank())

Instead of using facets to simulate dodging between colours, this approach creates a new variable interaction(colour.variable, x.variable) to be mapped to the x-axis.
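
To see what the combined x variable looks like, here is a minimal base R sketch that applies interaction() to the two relevant columns of the four-factor sample data from the question. The names x_combined and combined_levels are introduced here for illustration only.

```r
# Two of the columns from the four-factor sample data in the question.
iv.x <- rep(sort(rep(letters[3:4], 100)), 2)
iv.y <- rep(sort(rep(rev(letters)[1:2], 50)), 4)

# interaction() combines the two variables into a single factor.
# With the default lex.order = FALSE, the first argument varies fastest.
x_combined <- interaction(iv.y, iv.x)
combined_levels <- levels(x_combined)
print(combined_levels)

# Observations per combined level, pooled over all facets.
print(table(x_combined))
```

The levels come out as y.c, z.c, y.d, z.d, so the first two x positions belong to iv.x = c and the last two to iv.x = d. This is why the labels vector in scale_x_discrete() has four entries, with a label on the first position of each pair only.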

The rest of the code, in scale_x_discrete() and theme(), is there to hide the default x-axis labels, ticks and grid lines.

axis.text.x = element_text(hjust = -4) is a hack that shifts x-axis labels to approximately the right position. It's ugly, but considering the use case is for a manuscript submission, I assume the size of plots will be fixed, and you just need to tweak it once.

[Image: plot produced by the edited solution]

Original solution:

Assuming your plots don't otherwise require facetting, you can simulate the appearance with facets:

ggplot(DF, aes(x = iv.y, y = dv, colour = iv.y)) +
  geom_sina() + 
  facet_grid(~iv.x, switch = "x") +
  labs(x = "iv.x") +
  theme(axis.text.x = element_blank(),      # hide iv.y labels
        axis.ticks.x = element_blank(),     # hide iv.y ticks
        strip.background = element_blank(), # make facet strip background transparent
        panel.spacing.x = unit(0, "mm"))    # remove horizontal space between facets

[Image: plot produced by the original solution]

Z.Lin
  • Many thanks for this quick and excellent workaround, Z. Lin. Our actual design includes 4 factors and we do use facet_grid to display the remaining factors: `DF <- data.frame(dv=rnorm(400), iv.w=sort(rep(letters[1:2],200)), iv.x=rep(sort(rep(letters[3:4],100)), 2), iv.y=rep(sort(rep(rev(letters)[1:2],50)),4), iv.z=rep(sort(rep(letters[5:6],25)),8) )` Is this possible as well? – Tiberius May 18 '18 at 13:30
  • @Marcel Can you update your question with this info + what the desired final product looks like (e.g. with `geom_violin`)? I can try to update my answer with something close. – Z.Lin May 18 '18 at 13:38
  • Thank you, I have updated my question with an example for a 4 factor design accordingly. – Tiberius May 18 '18 at 15:05
  • Genius! Thanks :) – Tiberius May 20 '18 at 08:14