I want to draw boxplots with the number of observations printed on top of each box. The problem is that the y-axis range changes depending on the data and its outliers, so the labels can end up cut off. For that reason, I want the limits of scale_y_continuous to be set automatically. Is it possible to do this?

This is a reproducible example:

library(dplyr)
library(ggplot2)

myFreqs <- mtcars %>%  
  group_by(cyl, am) %>% 
  summarise(Freq = n()) 
myFreqs

p <- ggplot(mtcars, aes(factor(cyl), drat, fill=factor(am))) +
  stat_boxplot(geom = "errorbar") +
  geom_boxplot() +
  stat_summary(geom = 'text', label = paste("n = ", myFreqs$Freq), fun = max, position = position_dodge(width = 0.77), vjust=-1)

p

image 1

The idea is to extend the upper limit of the y-axis by some margin (for example +1) above the highest value in the plot (in the example above, that is the second boxplot, with n = 8).

I have tried to change the y-axis with scale_y_continuous like this:

p <- p + scale_y_continuous(limits = c(0, 5.3))
p

image 2

However, I don't want to hard-code the limits myself. I want a way to derive them from whatever data is plotted (because what if the data changes?). Is there a way to do something like this with min and max, e.g. scale_y_continuous(limits = c(min(x), max(x)))?

Thanks very much in advance

caldwellst
emr2
  • The `limits` argument accepts a function, so you can do `scale_y_continuous(limits = function(x){c(min(x), max(x))})`. Note that the input provided as `x` is the natural limits of the data, so this particular function would change nothing. – teunbrand Jan 27 '22 at 11:54
  • Yeah, in your case, you're probably looking for something like `p + scale_y_continuous(limits = ~ c(0, max(.x) + 0.4))`. – caldwellst Jan 27 '22 at 11:57
  • Thanks very much for your answers! That is exactly what I needed! @caldwellst what does `(.x)` do? Is it like a reduced version of the function that @teunbrand has written? – emr2 Jan 27 '22 at 12:06
  • Yeah, it's the lambda format used by `tidyverse` for anonymous functions: instead of `function(x)`, it basically represents `function(.x)`. If using R >= 4.1, you can also use `\(x)` as shorthand for `function(x)`. – caldwellst Jan 27 '22 at 12:08
  • Wow, thanks very much for the information and your help! @caldwellst – emr2 Jan 27 '22 at 12:09
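
To make the notations discussed in the comments concrete, here is a minimal sketch that writes the same limits function three ways. The names `lims_classic`, `lims_formula`, `lims_short` and `p_base` are made up for illustration, and the `\(x)` form assumes R >= 4.1.

```r
library(ggplot2)

# Three ways to write the same limits function. Each one receives the
# automatic limits as c(lower, upper) and returns the limits to use.
lims_classic <- function(x) c(0, max(x) + 0.4)   # plain anonymous function
lims_formula <- ~ c(0, max(.x) + 0.4)            # formula (lambda) notation
lims_short   <- \(x) c(0, max(x) + 0.4)          # shorthand, R >= 4.1 only

# Simplified version of the plot from the question
p_base <- ggplot(mtcars, aes(factor(cyl), drat, fill = factor(am))) +
  geom_boxplot()

# All three should give the same y-axis: 0 up to the data maximum + 0.4
p_base + scale_y_continuous(limits = lims_classic)
p_base + scale_y_continuous(limits = lims_formula)
p_base + scale_y_continuous(limits = lims_short)
```

Because the function is applied to whatever limits ggplot2 computes, the axis should follow the data if the data changes.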

1 Answer

Thanks to @teunbrand and @caldwellst, I got the solution I needed.

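There are 3 solutions that work perfectly:

1- A regular function that adds a fixed margin to the upper limit:

p + scale_y_continuous(limits = function(x){
  c(min(x), max(x) + 0.1)
})

2- The same thing in formula (lambda) notation:

library(tidyverse)

p + scale_y_continuous(limits = ~ c(min(.x), max(.x) + 0.1))

3- A function that scales the upper limit by 10% and rounds it up to the next integer:

p + scale_y_continuous(limits = function(x){
  c(min(x), ceiling(max(x) * 1.1))
})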
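
To see how the variants differ, you can call the limit functions directly on the automatic limits. This is only a sketch: it assumes the natural limits of the example plot are simply the range of `drat` in `mtcars`, and the names `add_margin` and `scale_and_round` are made up here.

```r
# The two limit functions from the solutions above, as named functions
add_margin      <- function(x) c(min(x), max(x) + 0.1)            # solutions 1 and 2
scale_and_round <- function(x) c(min(x), ceiling(max(x) * 1.1))   # solution 3

# Assumed natural limits of the example plot: the range of drat
natural_limits <- range(mtcars$drat)

add_margin(natural_limits)       # 2.76 5.03
scale_and_round(natural_limits)  # 2.76 6.00
```

So the fixed margin barely moves the upper limit, while the rounded version leaves noticeably more room above the labels.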
emr2
  • Was also looking for this. How would it need to be edited if I wanted to set the y-axis breaks? – Jackson A Swan Sep 16 '22 at 18:59
  • @JacksonASwan the function has a parameter to set the breaks. Check this https://www.geeksforgeeks.org/set-axis-breaks-of-ggplot2-plot-in-r/ – emr2 Sep 28 '22 at 06:55
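
Regarding the breaks question in the comments: as far as I know, `breaks` in `scale_y_continuous` also accepts a function of the limits, so it can be combined with a `limits` function. A minimal sketch, where `half_unit_breaks` and `p_breaks` are hypothetical names and the 0.5 step is an arbitrary choice:

```r
library(ggplot2)

# Hypothetical helper: breaks every 0.5 units, covering the limits it is given
half_unit_breaks <- function(lims) {
  seq(floor(lims[1]), ceiling(lims[2]), by = 0.5)
}

p_breaks <- ggplot(mtcars, aes(factor(cyl), drat, fill = factor(am))) +
  geom_boxplot() +
  scale_y_continuous(
    limits = function(x) c(min(x), max(x) + 0.1),  # automatic limits plus a margin
    breaks = half_unit_breaks                      # breaks computed from the limits
  )

p_breaks
```

Breaks that fall outside the limits are not drawn, so it is fine for the helper to overshoot a little at both ends.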