
I'd like to make a boxplot that shows the mean instead of the median. Moreover, I would like the whiskers to stop at the 5% (lower) and 95% (upper) quantiles. Here is the code:

ggplot(data, aes(x=Cement, y=Mean_Gap, fill=Material)) +
geom_boxplot(fatten = NULL,aes(fill=Material), position=position_dodge(.9)) +
xlab("Cement") + ylab("Mean cement layer thickness") +
stat_summary(fun=mean, geom="point", aes(group=Material), position=position_dodge(.9),color="black")

I tried changing the geom to errorbar, but that doesn't work. I tried middle = mean(Mean_Gap), but that doesn't work either. I also tried ymin = quantile(y,0.05), but nothing changed. Can anyone help me?

Figure: The standard boxplot using ggplot, with fill mapped to Material.

Jeremy Caney
Linda Vos

1 Answer


Here is how you can create the boxplot using custom parameters for the box and whiskers. It's the solution shown by @lukeA in stackoverflow.com/a/34529614/6288065, but this one will also show you how to make several boxes by groups.

The R built-in data set called "ToothGrowth" is similar to your data structure so I will use that as an example. We will plot the length of tooth growth (len) for each vitamin C supplement group (supp), separated/filled by dosage level (dose).

# "ToothGrowth" at a glance
head(ToothGrowth)
#   len supp dose
#1  4.2   VC  0.5
#2 11.5   VC  0.5
#3  7.3   VC  0.5
#4  5.8   VC  0.5
#5  6.4   VC  0.5
#6 10.0   VC  0.5


library(dplyr)
library(ggplot2)  # needed for ggplot() further down

# recreate the data structure with specific "len" coordinates to plot for each group
df <- ToothGrowth %>% 
    group_by(supp, dose) %>% 
    summarise(
        y0 = quantile(len, 0.05), 
        y25 = quantile(len, 0.25), 
        y50 = mean(len), 
        y75 = quantile(len, 0.75), 
        y100 = quantile(len, 0.95))

df
## A tibble: 6 x 7
## Groups:   supp [2]
#  supp   dose    y0   y25   y50   y75  y100
#  <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 OJ      0.5  8.74  9.7  13.2   16.2  19.7
#2 OJ      1   16.8  20.3  22.7   25.6  26.9
#3 OJ      2   22.7  24.6  26.1   27.1  30.2
#4 VC      0.5  4.65  5.95  7.98  10.9  11.4
#5 VC      1   14.0  15.3  16.8   17.3  20.8
#6 VC      2   19.8  23.4  26.1   28.8  33.3

# boxplot using the mean for the middle line and the 5%/95% quantiles for the whiskers
ggplot(df, aes(supp, fill = as.factor(dose))) +
    geom_boxplot(
        aes(ymin = y0, lower = y25, middle = y50, upper = y75, ymax = y100),
        stat = "identity"
    ) + 
    labs(y = "len", title = "Boxplot with Mean Middle Line") + 
    theme(plot.title = element_text(hjust = 0.5))

Figure: Standard boxplot vs. mean-based boxplot

In the figure above, the boxplot on the left is the standard boxplot, with the regular median line and regular min/max whiskers.
The boxplot on the right uses the mean as the middle line and the 5%/95% quantiles as the whiskers.
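
If you would rather not pre-summarise the data, the same five values can probably be computed on the fly with stat_summary(), which is closer to what the question was attempting. This is only a sketch: mean_box is a made-up helper name, and I am assuming stat_summary() with geom = "boxplot" and position = "dodge" behaves the same way in your ggplot2 version.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2): returns the five values
# that the boxplot geom needs, with the mean as the middle line and
# the 5%/95% quantiles as the whisker ends.
mean_box <- function(y) {
  q <- unname(quantile(y, c(0.05, 0.25, 0.75, 0.95)))
  data.frame(
    ymin   = q[1],
    lower  = q[2],
    middle = mean(y),
    upper  = q[3],
    ymax   = q[4]
  )
}

# One box per supp/dose combination, computed from the raw data
p <- ggplot(ToothGrowth, aes(supp, len, fill = as.factor(dose))) +
  stat_summary(fun.data = mean_box, geom = "boxplot", position = "dodge") +
  labs(y = "len", fill = "dose")

p
```

With your own data you would swap in aes(x = Cement, y = Mean_Gap, fill = Material). Note that no outlier points are drawn this way, because the helper only returns the five box values.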

LC-datascientist
  • The code above results in an error – treetopdewdrop Apr 13 '22 at 18:08
  • @treetopdewdrop I'm not getting an error, unless you're referring to the "ggplot" function not found. The OP is already using the "ggplot" function so I assumed the OP already loaded the "ggplot2" package to run `ggplot()`. If you are getting this error, you need to run `library(ggplot2)`. If you don't have "ggplot2", you can run `install.packages("ggplot2")` first. – LC-datascientist Apr 23 '22 at 18:45