How does R calculate boxplot.stats compared to summary/quantile for customized boxplot hinges and whiskers?

Question

I was trying to create a boxplot with custom whiskers that span the central 95% of the values, and hoped to do so with boxplot(x, range = c(0.025, 0.975)). This didn't work. According to the documentation, the range parameter is a multiplier applied to the interquartile range (IQR), the distance from the first quartile (the lower hinge) to the third quartile (the upper hinge), and it determines how far the whiskers may extend beyond the box. That is when I noticed that R gives different values for the 25% quantile (the 1st quartile in the summary function) and for the lower hinge in the boxplot$stats output. They differ only by a few decimal places, but the difference is consistent except in a few rare cases. I searched online but could not find a comparison between boxplot$stats and summary or quantile(data, probs = c(0.25, 0.75)). Likewise, the IQR function returns the difference between Q3 and Q1, which does not match the difference between the two hinges in boxplot$stats.

Also, if someone knows how to make a call like boxplot(data, range = c(0.5, 0.95)) work without creating a new variable to store each quantile and then building the boxplot in ggplot, that would be great. I can obviously use that workaround, but I wanted to ask whether I am overlooking a simpler way to customize a basic box-and-whisker plot.
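One base-R route that might avoid ggplot is sketched below. It assumes that the list returned by boxplot(..., plot = FALSE) can be edited and then handed to bxp(), the function that boxplot() itself uses for drawing. The names custom, probs and is_out are only illustrative, and the 2.5% / 97.5% whisker positions are just the ones from this question.

```r
set.seed(1984)
data <- rnorm(100, mean = 50, sd = 2)

# Start from the list that boxplot() would normally pass on for drawing.
custom <- boxplot(data, plot = FALSE)

# Overwrite the five plotted statistics: whiskers at the 2.5% and 97.5%
# quantiles, box edges at the quartiles, line at the median.
probs <- c(0.025, 0.25, 0.5, 0.75, 0.975)
custom$stats <- matrix(quantile(data, probs = probs, names = FALSE), ncol = 1)

# Anything beyond the new whiskers is drawn as an individual point.
is_out <- data < custom$stats[1, 1] | data > custom$stats[5, 1]
custom$out <- data[is_out]
custom$group <- rep(1, sum(is_out))

# Draw the boxplot from the edited list.
bxp(custom)
```

With several groups the same idea should carry over by filling one column of custom$stats per group and setting custom$group to the matching column index for each outlying point.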

Here is some sample code for anyone to try. As you can see when you run it, the difference between IQR(data) and Quant.Box$stats[4,1]-Quant.Box$stats[2,1] is not large, but it is there. What is boxplot.stats calculating differently?

R code example:
set.seed(1984)
data <- rnorm(100, mean = 50, sd = 2)
str(data)
summary(data)
quantile(data, probs = c(0.25, 0.5, 0.75))
boxplot(data, range = c(1), plot = FALSE)
IQR(data)
quantile(data, probs = 0.75)-quantile(data, probs = 0.25)
Quant.Box <- boxplot(data, range = c(1), plot = FALSE)
Quant.Box$stats[4,1]-Quant.Box$stats[2,1]

Here are the results from my console:

> IQR(data)
[1] 2.871912
> quantile(data, probs = 0.75)-quantile(data, probs = 0.25)
     75% 
2.871912 
> Quant.Box <- boxplot(data, range = c(1), plot = FALSE)
> Quant.Box$stats[4,1]-Quant.Box$stats[2,1]
[1] 2.882812
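A possible explanation for the gap, sketched under the assumption that boxplot.stats takes its hinges from fivenum() (Tukey's hinges) while quantile(), summary() and IQR() use the default interpolation rule (type = 7): for n = 100 the lower hinge would be the midpoint of the 25th and 26th sorted values, whereas the type 7 quartile sits three quarters of the way between them. The variable names below are illustrative only.

```r
set.seed(1984)
data <- rnorm(100, mean = 50, sd = 2)
x <- sort(data)

# Tukey hinges as returned by fivenum(): elements 2 and 4.
lower_hinge <- fivenum(data)[2]
upper_hinge <- fivenum(data)[4]

# Default quantile (type = 7) interpolates at position (n - 1) * p + 1,
# which is 25.75 for p = 0.25 and n = 100.
q1 <- quantile(data, probs = 0.25, names = FALSE)

# Each comparison should print TRUE if the assumptions above hold.
print(all.equal(lower_hinge, mean(x[25:26])))
print(all.equal(q1, x[25] + 0.75 * (x[26] - x[25])))
print(all.equal(boxplot.stats(data)$stats[2:4], fivenum(data)[2:4]))

# Hinge-based spread versus quantile-based IQR.
print(c(hinges = upper_hinge - lower_hinge, quantiles = IQR(data)))
```

If this is right, the two values coincide for odd n and can differ slightly for even n, which would account for the small but consistent difference.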

There is also the question of how the whiskers of boxplot(data, range = c(1)) are calculated, since the documentation says that the range parameter is multiplied by the IQR. The upper whisker, Quant.Box$stats[5,1], is 54.44820, whereas adding one IQR to the upper hinge or to the third quartile gives:

> Quant.Box$stats[4,1]-Quant.Box$stats[2,1]+Quant.Box$stats[4,1]
[1] 54.50608
> IQR(data)+Quant.Box$stats[4,1]
[1] 54.49518
> IQR(data)+quantile(data, probs = 0.75)
     75% 
54.48453

However, as you can see, each of these gives a different value for the upper whisker, and none of them matches. What is going on here? What am I missing? I am not an expert in R and have only been using it for just over 6 months, so I hope this question is still of some interest to those who understand R better. Thank you for any help.
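One reading of the documentation that would explain the mismatch: hinge + range * IQR is only a fence, and the whisker is drawn at the most extreme data point that still lies inside that fence, so it is always an actual observation. The sketch below checks this against boxplot.stats, assuming that the range argument of boxplot() is passed on as coef and that the IQR used is the distance between the hinges. The variable names are illustrative.

```r
set.seed(1984)
data <- rnorm(100, mean = 50, sd = 2)

coef <- 1  # plays the role of 'range = 1' in boxplot()

# Box edges and their spread, taken from the Tukey hinges.
hinges <- fivenum(data)[c(2, 4)]
hinge_iqr <- diff(hinges)

# Fences: the furthest a whisker is allowed to reach.
lower_fence <- hinges[1] - coef * hinge_iqr
upper_fence <- hinges[2] + coef * hinge_iqr

# Whiskers: the most extreme observations that are still inside the fences.
lower_whisker <- min(data[data >= lower_fence])
upper_whisker <- max(data[data <= upper_fence])

# Should print TRUE if boxplot.stats works the way assumed here.
print(all.equal(c(lower_whisker, upper_whisker),
                boxplot.stats(data, coef = coef)$stats[c(1, 5)]))
```

Under this reading, the 54.50608 computed from the hinges above would be the upper fence, and 54.44820 would be the largest observation below it.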
