I am using an R data frame that looks something like this:

# data_frame() is deprecated; base data.frame() needs no extra packages
df <- data.frame(
  A  = sample(0:2,  size = 50, replace = TRUE),
  A1 = sample(0:50, size = 50, replace = TRUE),
  A2 = sample(0:50, size = 50, replace = TRUE))

I am trying to produce a grouped dot plot of A1 and A2, grouped by A, e.g.:

Grouped Plot

To produce this, I converted the data from wide to long format using the code below and then used position_dodge:

library(reshape2)
df2 <- melt(df, id.vars=c("A"))

I'm trying to work out how to draw the equivalent graph without converting the format. My current code is:

library(ggplot2)
ggplot(df,aes(x=as.factor(A)))+
    xlab("A") + 
    ylab("Score") +
    geom_dotplot (aes(y=A1), binaxis="y",binwidth=1, stackdir = "center",fill="#5ec962",colour="#5ec962") + 
    stat_summary(aes(y=A1),fun = median, fun.min = median, fun.max = median, geom = "crossbar", width = 0.2,colour="#5ec962") +
    geom_dotplot (aes(y=A2), binaxis="y",binwidth=1, stackdir = "center",fill="#3b528b",colour="#3b528b") + 
    stat_summary(aes(y=A2),fun = median, fun.min = median, fun.max = median, geom = "crossbar", width = 0.2,colour="#3b528b") +
    theme_classic()

This has the two groups on top of each other like this:

Ungrouped plot

Is there a way to produce a graph like the first one, with the groups side by side, without converting the format? I'm planning to look at multiple comparisons and would ideally like to avoid reshaping the data each time.

Thank you.

Jon Spring
beanie42
  • One of the principles in R (and probably most statistical languages, I only know R) is that the shape of your data needs to fit your methods, and not the other way round. Fear not, the more experienced you get, the less you will be bothered by reshaping the data. In fact, it will become second nature to you. – tjebo Jun 30 '22 at 12:32
  • Having said that, your data should ideally be in long format for this graph. Your approach with reshape2 is fine; there are more modern ways, including tidyr::pivot_longer, so it depends on your preference. – tjebo Jun 30 '22 at 12:34
  • I heartily agree with @tjebo here. Yes, there are ways to draw the plot without reshaping your data, but it will require a lot more code, and is quite 'hacky'. Remember, you don't need to store your long data frame each time. Just reshape it and pass it by the pipe to the first argument of ggplot. Consider it part of your plotting code - it will only require one extra line. Trying to avoid converting to long format will lead to several extra lines of code, duplication, and difficulty debugging. – Allan Cameron Jun 30 '22 at 12:37
  • you could define the position with `position_dodge` before you plot, but as @AllanCameron and @tjebo said, long format is the easy and proper way to do it. – denis Jun 30 '22 at 12:47
  • Thank you all, and in particular @AllanCameron for the idea of piping the reshape into the first argument of ggplot, which has significantly simplified things. – beanie42 Jun 30 '22 at 13:26
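
For anyone landing here later, a minimal sketch of the approach suggested in the comments (reshape inline and pipe the result straight into ggplot) might look like the following. It assumes R 4.1 or later for the native `|>` pipe and swaps in `tidyr::pivot_longer` for `reshape2::melt`; the seed, the `variable`/`value` column names, and the dodge width of 0.8 are illustrative choices, not something from the original post.

```r
library(ggplot2)
library(tidyr)

set.seed(42)  # assumption: fixed seed only so the example is reproducible
df <- data.frame(
  A  = sample(0:2,  size = 50, replace = TRUE),
  A1 = sample(0:50, size = 50, replace = TRUE),
  A2 = sample(0:50, size = 50, replace = TRUE)
)

# Assumption: the same dodge width is used for dots and crossbars so they line up
dodge <- position_dodge(width = 0.8)

# The long data frame is created on the fly and never stored
p <- df |>
  pivot_longer(c(A1, A2), names_to = "variable", values_to = "value") |>
  ggplot(aes(x = factor(A), y = value, fill = variable, colour = variable)) +
  geom_dotplot(binaxis = "y", binwidth = 1, stackdir = "center",
               position = dodge) +
  stat_summary(fun = median, fun.min = median, fun.max = median,
               geom = "crossbar", width = 0.2, position = dodge) +
  # Same colours as in the question, now mapped per variable
  scale_fill_manual(values = c(A1 = "#5ec962", A2 = "#3b528b")) +
  scale_colour_manual(values = c(A1 = "#5ec962", A2 = "#3b528b")) +
  labs(x = "A", y = "Score") +
  theme_classic()

p
```

Because the reshape is part of the plotting pipeline, comparing a different pair of columns should only need the `pivot_longer()` selection and the two colour scales changed; `df` itself stays in wide format.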
