I have a dataframe that I'd like to plot in a violin plot, where the observations are binned into one of two groups: [va_AC == 1] or [va_AC > 1].

Is there a way to do this without adding a new column to df to assign it a group?

    ggplot(df) +
      geom_violin(aes(x = [va_AC == 1] or [va_AC > 1],  # pseudocode for the grouping I want
                      y = g_AD))
Carmen Sandoval

1 Answer

If you don't want to change your initial dataset, you can create the grouping variable inside a piped workflow: the new column exists only in the data passed to ggplot, so the df object in your environment is left unchanged. Here I use case_when to choose the split (although I'm sure there are many other ways).

    library(dplyr)
    library(ggplot2)

    # create reproducible data
    set.seed(1)
    df <- data.frame(va_AC = c(rep(1, 20), runif(80, 1.0001, 100)),
                     g_AD = rnorm(100, 25, 5))

    # combine in a pipe; df itself is not modified
    df %>%
      # create new grouped variable
      mutate(split = case_when(
        va_AC == 1 ~ "A",
        va_AC > 1 ~ "B"
      )) %>%
      # plot as before
      ggplot(.) +
      geom_violin(aes(x = split, y = g_AD))
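
Another option that avoids the intermediate column altogether is to put the grouping expression straight into aes(), since ggplot2 evaluates aesthetic expressions against the data. This is only a minimal sketch: the seed, the axis labels and the object name p are my own illustrative choices, not something from the question.

```r
library(ggplot2)

# Same simulated data as above; the seed value is arbitrary
set.seed(1)
df <- data.frame(va_AC = c(rep(1, 20), runif(80, 1.0001, 100)),
                 g_AD = rnorm(100, 25, 5))

# The grouping is computed inside aes(), so no column is added anywhere.
# factor() only supplies readable axis labels (the label text is illustrative).
p <- ggplot(df, aes(x = factor(va_AC > 1,
                               levels = c(FALSE, TRUE),
                               labels = c("va_AC == 1", "va_AC > 1")),
                    y = g_AD)) +
  geom_violin() +
  labs(x = "group")

print(p)
```

Note that `va_AC > 1` puts every value that is not greater than 1 into the first group, so this only matches the two groups in the question if va_AC never drops below 1.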
m.evans
Since there are only two levels to the split variable, you could also simplify to just an `ifelse`. For more than two levels, finding `case_when` was such a quality-of-life improvement for me. – camille Apr 19 '18 at 16:45
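
For reference, the `ifelse` variant suggested in the comment could look roughly like the sketch below. The labels "A" and "B" follow the answer; the seed and the object name p are illustrative assumptions.

```r
library(dplyr)
library(ggplot2)

# Same simulated data as in the answer; the seed value is arbitrary
set.seed(1)
df <- data.frame(va_AC = c(rep(1, 20), runif(80, 1.0001, 100)),
                 g_AD = rnorm(100, 25, 5))

# ifelse() replaces the two-branch case_when(); df itself is not modified
p <- df %>%
  mutate(split = ifelse(va_AC == 1, "A", "B")) %>%
  ggplot(aes(x = split, y = g_AD)) +
  geom_violin()

print(p)
```

One difference to keep in mind: `ifelse` sends every value other than 1 to "B", whereas the `case_when` version returns NA for anything that matches neither condition (for example values below 1).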