I am trying to produce a plot with a discontinuous y-axis but can't get the facet titles to only show once:

Example Data:

library(ggplot2)
data(mpg)

# Store the plot as `p` so it can be passed to gg.gap() below
p <- ggplot(mpg, aes(displ, cty)) +
  geom_point() +
  facet_grid(. ~ drv)
p

After much digging it appears that this is impossible in ggplot2, but I have discovered the gg.gap package. However, this package replicates the facet titles for each segment of the plot. Let's say I want a break in the y axis from 22-32 as follows:

library(gg.gap)
gg.gap(plot = p,
       segments = c(22, 32),
       ylim = c(0, 35))

The facet titles are repeated for each plot segment, which is confusing and looks terrible. I would be grateful for any insight or help anyone could provide. I'm stumped.

I know this is possible if I plot in base R, but given other constraints I am unable to do so (I need the graphics/grammar provided by ggplot2).

Thanks in advance!

DMC

2 Answers

This is a bit of an ugly workaround. The idea is to set y-values in the broken portion to NA so that no points are drawn there. Then, we facet on a findInterval() with the breaks of the axes (negative because we want to preserve bottom-to-top axes). Finally we manually resize the panels with ggh4x::force_panelsizes() to set the 2nd panel to have 0 height. Full disclaimer, I wrote ggh4x so I'm biased.
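To make the negated findInterval() trick concrete, here is a small base R sketch. The sample values are made up for illustration; only the break vector comes from the answer. It shows which row each value would be assigned to, and why negating the index puts the highest interval first (facet rows are laid out in increasing order from top to bottom).

```r
# Breaks used for faceting: below 22, the gap [22, 32), and 32 or above
breaks <- c(-Inf, 22, 32, Inf)

# Made-up y-values, one or two per interval
y <- c(10, 22, 31, 32, 35)

# Interval index per value: 1 = lower panel, 2 = gap, 3 = upper panel
idx <- findInterval(y, breaks)
idx
#> [1] 1 2 2 3 3

# Negating reverses the sort order, so the upper interval (-3) sorts
# first and ends up as the top facet row
row_id <- -idx
sort(unique(row_id))
#> [1] -3 -2 -1
```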

A few details: the strips along the y-direction are hidden by setting the relevant theme elements to blank. Also, ideally you'd calculate what proportion the upper facet should be relative to the lower facet and replace the 0.2 by that number.

library(ggplot2)
library(ggh4x)

ggplot(mpg, aes(displ, cty)) +
  geom_point(aes(y = ifelse(cty >= 22 & cty < 32, NA, cty))) +
  facet_grid(-findInterval(cty, c(-Inf, 22, 32, Inf)) ~ drv, 
             scales = "free_y", space = "free_y") +
  theme(strip.background.y = element_blank(),
        strip.text.y = element_blank(),
        panel.spacing.y = unit(5.5/2, "pt")) +
  force_panelsizes(rows = c(0.2, 0, 1))
#> Warning: Removed 20 rows containing missing values (geom_point).
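As a rough way to pick the number that replaces 0.2, one could compare the span of the data above the break with the span below it. The helper below is a hypothetical sketch (panel_ratio is not part of ggh4x), it ignores scale expansion, and the input vector is made up rather than taken from mpg.

```r
# Hypothetical helper: ratio of the data range above the gap to the
# data range below it. `gap` is c(lower_break, upper_break).
panel_ratio <- function(y, gap) {
  lower <- y[y < gap[1]]
  upper <- y[y >= gap[2]]
  diff(range(upper)) / diff(range(lower))
}

# Made-up values standing in for the plotted variable
y_example <- c(9, 12, 15, 18, 21, 25, 28, 33, 35)

ratio <- panel_ratio(y_example, gap = c(22, 32))
ratio
```

The result could then be used in place of the hard-coded value, for example force_panelsizes(rows = c(ratio, 0, 1)).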

Alternative approach for boxplot:

Instead of censoring the bit on the break, you can duplicate the data and manipulate the position scales to show what you want. We rely on the clipping of the data by the coordinate system to crop the graphical objects.

library(ggplot2)
library(ggh4x)

ggplot(mpg, aes(class, cty)) +
  geom_boxplot(data = ~ transform(., facet = 2)) +
  geom_boxplot(data = ~ transform(., facet = 1)) +
  facet_grid(facet ~ drv, scales = "free_y", space = "free_y") +
  facetted_pos_scales(y = list(
    scale_y_continuous(limits = c(32, NA), oob = scales::oob_keep, # <- keeps data
                       expand = c(0, 0, 0.05, 0)),
    scale_y_continuous(limits = c(NA, 21), oob = scales::oob_keep,
                       expand = c(0.05, 0, 0, 0))
  )) +
  theme(strip.background.y = element_blank(),
        strip.text.y = element_blank())

teunbrand
  • Thanks @teunbrand. Thinking ahead to future uses, is there any way to get this same result but without changing the non-represented data to `NA`? I'm thinking about bar, line, or box plots here. Changing the data to `NA` will cause problems in those instances. I asked the same of @JonSpring within the framework of his answer. I can see uses for each solution but am unsure how to expand either to a broader range of plots. – DMC Feb 19 '21 at 15:13
  • I've made an edit with a solution suitable for a boxplot, though the approach is a bit different. – teunbrand Feb 19 '21 at 16:41
  • @teunbrand - Amazing! I am getting an error (`Error: 'oob_keep' is not an exported object from 'namespace:scales'`), however. This error appears even when I run your code exactly as written in your box plot example. I tried updating my `scales` package but had no luck. Any ideas? – DMC Feb 19 '21 at 22:09
  • It should be in scales v1.1.1 I think according to [their news](https://github.com/r-lib/scales/blob/master/NEWS.md). Here is the function on [github](https://github.com/r-lib/scales/blob/bb1c423004ec2e951ff77253e784f4c23af5a5e3/R/bounds.r#L276), it should be easy enough to copy. – teunbrand Feb 19 '21 at 22:15
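If the installed scales version does not export oob_keep, a local stand-in is one possible workaround. The sketch below assumes that the out-of-bounds handler only needs to return its input unchanged, which is my understanding of what oob_keep does; the name keep_oob is hypothetical and chosen so it does not mask anything in scales.

```r
# Hypothetical local stand-in for an out-of-bounds handler that keeps
# every value, including those outside the scale limits.
# `range` is accepted for signature compatibility but deliberately unused.
keep_oob <- function(x, range = c(0, 1)) {
  x
}

# Values outside the limits pass through untouched
kept <- keep_oob(c(5, 21, 40), range = c(32, 50))
kept
#> [1]  5 21 40
```

It would then be passed as oob = keep_oob in place of oob = scales::oob_keep.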
Here's an approach that relies on changing the data before ggplot2, and then adjusting the scale labels, comparable to what you do for a secondary y axis.

library(dplyr)
library(ggplot2)
low_max <- 22.5
high_min <- 32.5
adjust <- high_min - low_max

mpg %>%
  mutate(cty2 = as.numeric(cty),
         cty2 = case_when(cty < low_max ~ cty2,
                         cty > high_min ~ cty2 - adjust,
                         TRUE ~ NA_real_)) %>%
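  ggplot(aes(displ, cty2)) +
  geom_point() +
  # White line marking the position of the axis break
  annotate("segment", color = "white", size = 2,
       x = -Inf, xend = Inf, y = low_max, yend = low_max) +
  # Shift the labels above the break back up by the size of the gap
  scale_y_continuous(breaks = 1:50, 
                     labels = function(x) {x + ifelse(x >= low_max, adjust, 0)}) +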
  facet_grid(. ~ drv)
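The data shift and the label shift are inverses of each other, which can be checked without plotting. This base R sketch reuses low_max, high_min and adjust from the answer; the function names squeeze_gap and relabel_gap and the sample values are made up for illustration.

```r
low_max  <- 22.5
high_min <- 32.5
adjust   <- high_min - low_max

# Data transform: keep values below the gap, shift values above it
# down by the gap size, and drop values inside the gap
squeeze_gap <- function(y) {
  ifelse(y < low_max, y,
         ifelse(y > high_min, y - adjust, NA_real_))
}

# Label transform: undo the shift for positions above the break
relabel_gap <- function(x) {
  x + ifelse(x >= low_max, adjust, 0)
}

y <- c(9, 21, 25, 33, 35)   # made-up values
squeezed <- squeeze_gap(y)
squeezed
#> [1]  9 21 NA 23 25

restored <- relabel_gap(squeezed)
restored
#> [1]  9 21 NA 33 35
```

Every value outside the gap is labelled with its original value, while values inside the gap become NA and are not drawn.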

Jon Spring
  • Thanks @jon! Thinking ahead to future uses, is there any way to get this same result but without changing the non-represented data to `NA`? I'm thinking about bar, line, or box plots here. Changing the data to `NA` will cause problems in those instances. I asked the same of @teunbrand within the framework of his answer. I can see uses for each solution but am unsure how to expand either to a broader range of plots. – DMC Feb 19 '21 at 15:14