2

With the R code below,

library(ggplot2)
library(ggridges)
ggplot(iris, aes(x = Sepal.Length, y = Species)) +
  stat_density_ridges(quantile_lines = TRUE, quantiles = c(0.025, 0.975), alpha = 0.7)

I get the following ridge plot

enter image description here

I would like to replace the two quantile lines (2.5% and 97.5%) under each density plot with two different lines based on the following code:

require(coda)
aggregate(Sepal.Length ~ Species, iris, function(x) HPDinterval(as.mcmc(x), 0.975))

In other words, I wish to put those vertical lines at the following locations:

     Species Sepal.Length.1 Sepal.Length.2
1     setosa            4.3            5.8
2 versicolor            4.9            7.0
3  virginica            4.9            7.9

How to achieve this?

bluepole
  • 323
  • 1
  • 12

1 Answers1

3

The inverse of quantile is ecdf, so you can back-transform your lengths into quantiles to feed to stat_density_ridges:

aggregate(Sepal.Length ~ Species, iris, function(x) {
  ecdf(x)(HPDinterval(as.mcmc(x), 0.975))})
#>      Species Sepal.Length.1 Sepal.Length.2
#> 1     setosa           0.02           1.00
#> 2 versicolor           0.02           1.00
#> 3  virginica           0.02           1.00

And you will see that the quantiles you are looking for are 0.02 and 1, so now you can plot:

ggplot(iris, aes(x = Sepal.Length, y = Species)) +
  stat_density_ridges(quantile_lines = TRUE, quantiles = c(0.02, 1), 
                      alpha = 0.7)

enter image description here

Note also that the function you are passing to aggregate is the same as passing range

aggregate(Sepal.Length ~ Species, iris, range)
#>      Species Sepal.Length.1 Sepal.Length.2
#> 1     setosa            4.3            5.8
#> 2 versicolor            4.9            7.0
#> 3  virginica            4.9            7.9

So you could simply pass 0 and 1 as quantiles, which matches the numbers in your table perfectly:

ggplot(iris, aes(x = Sepal.Length, y = Species)) +
    stat_density_ridges(quantile_lines = TRUE, quantiles = c(0, 1), 
                        alpha = 0.7)

enter image description here


Edit

If the quantiles returned by your function are all different, you could use multiple geom layers, which you could have in a list, like this:

library(ggridges)

ggplot(iris, aes(x = Sepal.Length, y = Species)) +
  lapply(split(iris, iris$Species)[3:1], function(x) {
  stat_density_ridges(quantile_lines = TRUE, 
  quantiles = ecdf(x$Sepal.Length)(HPDinterval(as.mcmc(x$Sepal.Length), 0.75)), 
  alpha = 0.7, data = x)})

enter image description here

Allan Cameron
  • 147,086
  • 7
  • 49
  • 87
  • Very nice. Thanks! How about if I want to put those vertical lines at `aggregate(Sepal.Length ~ Species, iris, function(x) {ecdf(x)(HPDinterval(as.mcmc(x), 0.75))})`? – bluepole Sep 16 '22 at 17:04
  • 1
    If your quantiles are all different, then you will need mutiple layers - see my update. – Allan Cameron Sep 16 '22 at 18:41
  • Thanks a lot! Is there a way to avoid using multiple layers? I may have 20 or more different species, and in that case it would become quite clunky to code it in a layer-wise fashion. – bluepole Sep 16 '22 at 19:05
  • 1
    @bluepole I'm sure there are a few different ways to do it, though if you want to automate it, your two options are either to create all the layers in a list, or to calculate the density heights and draw them in as segments. Both of these are tricky, but possible. I can have a look at this when I get the chance. – Allan Cameron Sep 16 '22 at 19:21
  • 1
    @bluepole actually, it's easier than I thought. See my update. – Allan Cameron Sep 16 '22 at 19:35