Let's consider the bagplot example included in the aplpack package in R. A bagplot is a bivariate generalisation of a boxplot and therefore gives insight into the distribution of data points along both axes.

Example of a bagplot: car data bagplot

Code for the example:

  library(aplpack)  # provides bagplot()
  # example of Rousseeuw et al., see R-package rpart
  cardata <- structure(as.integer( c(2560,2345,1845,2260,2440,
   2285, 2275, 2350, 2295, 1900, 2390, 2075, 2330, 3320, 2885,
   3310, 2695, 2170, 2710, 2775, 2840, 2485, 2670, 2640, 2655,
   3065, 2750, 2920, 2780, 2745, 3110, 2920, 2645, 2575, 2935,
   2920, 2985, 3265, 2880, 2975, 3450, 3145, 3190, 3610, 2885,
   3480, 3200, 2765, 3220, 3480, 3325, 3855, 3850, 3195, 3735,
   3665, 3735, 3415, 3185, 3690, 97, 114, 81, 91, 113, 97, 97,
   98, 109, 73, 97, 89, 109, 305, 153, 302, 133, 97, 125, 146,
   107, 109, 121, 151, 133, 181, 141, 132, 133, 122, 181, 146,
   151, 116, 135, 122, 141, 163, 151, 153, 202, 180, 182, 232,
   143, 180, 180, 151, 189, 180, 231, 305, 302, 151, 202, 182,
   181, 143, 146, 146)), .Dim = as.integer(c(60, 2)), 
   .Dimnames = list(NULL, c("Weight", "Disp.")))
  bagplot(cardata,factor=3,show.baghull=TRUE,
    show.loophull=TRUE,precision=1,dkmethod=2)
  title("car data Chambers/Hastie 1992")
  # points of y=x*x
  bagplot(x=1:30,y=(1:30)^2,verbose=FALSE,dkmethod=2)

The bagplot function in aplpack seems to support plotting a "bag" for only a single data series. It would be even more interesting to plot two (or three) data series within a single bagplot, so that visually comparing the "bags" gives insight into how the distributions of the series differ. Does anyone know whether this can be done in R, and if so, how?

Ben
Niek Tax
  • `bagplot` has an `add` parameter. This is something you could have answered by a more careful reading of the help page. – IRTFM Apr 07 '15 at 21:24
  • Also consider faceting it, via `par('mfrow')`, `par('mfcol')`, `layout()`, or `par('fig')`. – r2evans Apr 07 '15 at 22:00
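
To make the two comments above concrete, here is a minimal sketch of both approaches. It is only an illustration under some assumptions: the two groups are taken from the built-in iris data (not from the question), the colours are arbitrary, and it assumes an aplpack version whose `bagplot()` has the `add` and `col.*` arguments described on its help page.

```r
library(aplpack)

# Two groups from the built-in iris data, chosen only for illustration
g1 <- subset(iris, Species == "setosa")
g2 <- subset(iris, Species == "versicolor")
both <- rbind(g1, g2)

# 1. Overlay: draw an empty plot that spans both groups,
#    then add one bagplot per group with add = TRUE
plot(both$Sepal.Length, both$Sepal.Width, type = "n",
     xlab = "Sepal.Length", ylab = "Sepal.Width")
bagplot(x = g1$Sepal.Length, y = g1$Sepal.Width, add = TRUE,
        col.baghull = "#ff9999", col.loophull = "#ffdddd",
        col.bagpoints = "#880000", col.looppoints = "#cc3333")
bagplot(x = g2$Sepal.Length, y = g2$Sepal.Width, add = TRUE,
        col.baghull = "#99cc99", col.loophull = "#ddffdd",
        col.bagpoints = "#004400", col.looppoints = "#339933")
title("two bagplots in one plot")

# 2. Faceting: one panel per group, side by side
old_par <- par(mfrow = c(1, 2))
bagplot(x = g1$Sepal.Length, y = g1$Sepal.Width)
title("setosa")
bagplot(x = g2$Sepal.Length, y = g2$Sepal.Width)
title("versicolor")
par(old_par)  # restore the previous graphics settings
```

In the overlay, the later bag is drawn on top of the earlier one, so the drawing order matters when the groups overlap. In the faceted version each panel gets its own axis limits unless you set them yourself, which makes the overlay the easier one to compare by eye.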

1 Answer

If we modify some of the aplpack::bagplot code we can make a new geom for ggplot2. Then we can compare groups within a dataset in the usual ggplot2 ways. Here's one example:

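library(ggplot2)
# geom_bag() is not part of ggplot2; it has to be defined first
# (the code is in the gist linked at the end of this answer)
ggplot(iris, aes(Sepal.Length, Sepal.Width,
                 colour = Species, fill = Species)) +
  geom_bag() +
  theme_minimal()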

and we can show the points with the bagplot:

ggplot(iris, aes(Sepal.Length, Sepal.Width,
                 colour = Species, fill = Species)) +
  geom_bag() +
  geom_point() +
  theme_minimal()

Here's the code for the geom_bag and modified aplpack::bagplot function: https://gist.github.com/benmarwick/00772ccea2dd0b0f1745

Ben
  • What is the best way to go about installing this in R? – reas0n Jan 05 '18 at 20:59
  • @ElijahRockers I added some code to the gist to show how to load the functions so you can use them locally: https://gist.github.com/benmarwick/00772ccea2dd0b0f1745#file-002_bag_demo-r-L1 – Ben Jan 07 '18 at 07:39