2

I'm trying to reproduce this chart from Our World in Data.

enter image description here

I'm searching for methods that will get the line labels to look as close as possible to the original. Here's what I've got so far (shows the ggrepel() version, see commented out line for alternate):

library(tidyverse)
library(ggrepel)
keep <- c("Israel", "United Arab Emirates", "United Kingdom",
          "United States", "Chile", "European Union", "China",
          "Russia", "Brazil", "World", "Mexico", "Indonesia",
          "Bangladesh")

owid <- read_csv("https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/vaccinations/vaccinations.csv") %>%
  filter(location %in% keep) %>%
  filter(date >= "2021-01-01" & date <= "2021-02-12") %>%
  select(location, date, total_vaccinations_per_hundred) %>%
  arrange(location, date) %>%
  group_by(location) %>%
  complete(date = seq.Date(as.Date("2021-01-01"), 
                           as.Date("2021-02-12"), 
                           by="day")) %>%
  fill(total_vaccinations_per_hundred) %>%
  ungroup() %>%
  mutate(location = factor(location),
         location = fct_reorder2(location, total_vaccinations_per_hundred,
                                 total_vaccinations_per_hundred)) %>%
  mutate(label = if_else(date == max(date), 
                         as.character(location), 
                         NA_character_))

owid %>%
  ggplot(aes(x=date, y=total_vaccinations_per_hundred, group=location,
                 color=location)) +
  geom_point() + 
  geom_line() +
  scale_y_continuous(breaks=c(seq(0, 70, 10))) +
  theme_minimal() + 
  labs(title = "Cumulative COVID-19 vaccination doses administered per 100 people",
       subtitle = "This is counted as a single dose, and may not equal the total number of people vaccinated, depending on the specific dose regime (e.g. people receive multiple doses).",
       caption = "Source: Official data collected by Our World in Data — Last updated 13 February, 11:40 (London time)",
       y="",
       x="") +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.major.y = element_line(linetype = "dashed"),
        panel.grid.minor.y = element_blank(),
        panel.grid.minor.x = element_blank(),
        plot.title.position = "plot",
        plot.title = element_text(face="bold"),
        legend.position = "none") +
  geom_label_repel(aes(label = label),
                   nudge_x = 1,
                   hjust = "left", direction="y",
                   na.rm = TRUE) +
  #geom_label(aes(label = label), hjust=0, nudge_x = 1) +
  scale_x_date(breaks = as.Date(c("2021-01-01",
                                  "2021-01-10",
                                  "2021-01-15",
                                  "2021-01-20",
                                  "2021-01-25",
                                  "2021-01-30",
                                  "2021-02-04",
                                  "2021-02-12")),
               labels = scales::date_format("%b %d"),
               limits = as.Date(c("2021-01-01",
                                  "2021-03-01"))) 
Eric Green
  • 7,385
  • 11
  • 56
  • 102
  • Possibly related: https://stackoverflow.com/a/47983622/11374827 – teunbrand Feb 22 '21 at 15:36
  • Thanks, @teunbrand. Drawing a second axis should probably be in the mix as a potential solution. The suggestion comes from 2017, and I feel like there have been a lot of ggplot text support developments since then. Hoping it's possible to do without combining plots. – Eric Green Feb 22 '21 at 15:46
  • 1
    I'm unsure about the text improvements. Ideally, the `ggrepel` package would support an axis that repels closely spaced labels, but I don't think they do. Plot composition has become much better with patchwork though. – teunbrand Feb 22 '21 at 15:48
  • 1
    Yes, it is possible, but it could be a bit tricky. Please see https://ggrepel.slowkow.com/articles/examples.html#make-curved-line-segments-or-arrows-1. You can play with arguments `segment.inflect`, `segment.square`, `segment.curvature`, etc – gavg712 Feb 22 '21 at 16:31

1 Answers1

8

Here is a lazy but consistent trick: Plot two geom_text_repel(). The first one with (a) an space (" ") for text, and (1) the links in color, the second one with the (b) actual label text, and (2) the links with complete transparency (i.e. segment.alpha = 0). This trick will force the rightmost end of the link towards the position of the first letter of the second label.

Replicating your code up to the geom_repels:

 G01 <-  
  owid %>%
  ggplot(aes(x=date, y=total_vaccinations_per_hundred, group=location,
             color=location)) +
  geom_point() + 
  geom_line() +
  scale_y_continuous(breaks=c(seq(0, 70, 10))) +
  scale_x_date(limits = as.Date(c("2021-01-01", "2021-02-25"))) +
  theme_minimal() + 
  labs(title = "Cumulative COVID-19 vaccination doses administered per 100 people",
       subtitle = "This is counted as a single dose, and may not equal the total number of people vaccinated, depending on the specific dose regime (e.g. people receive multiple doses).",
       caption = "Source: Official data collected by Our World in Data — Last updated 13 February, 11:40 (London time)",
       y="",
       x="") +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.major.y = element_line(linetype = "dashed"),
        panel.grid.minor.y = element_blank(),
        panel.grid.minor.x = element_blank(),
        plot.title.position = "plot",
        plot.title = element_text(face="bold"),
        legend.position = "none") +
scale_x_date(breaks = as.Date(c("2021-01-01",
                                "2021-01-10",
                                "2021-01-15",
                                "2021-01-20",
                                "2021-01-25",
                                "2021-01-30",
                                "2021-02-04",
                                "2021-02-12")),
             labels = scales::date_format("%b %d"),
             limits = as.Date(c("2021-01-01",
                                "2021-03-01")))

Adding the two custom geom_text_repels:

   G01 +
  geom_text_repel(aes(label = gsub("^.*$", " ", label)), # This will force the correct position of the link's right end.
                  segment.curvature = -0.1,
                  segment.square = TRUE,
                  segment.color = 'grey',
                  box.padding = 0.1,
                  point.padding = 0.6,
                  nudge_x = 0.15,
                  nudge_y = 1,
                  force = 0.5,
                  hjust = 0,
                  direction="y",
                  na.rm = TRUE, 
                  xlim = as.Date(c("2021-02-16", "2021-03-01")),
                  ylim = c(0,73.75),
  ) +
  geom_text_repel(data = . %>% filter(!is.na(label)),
                  aes(label = paste0("  ", label)),
                  segment.alpha = 0, ## This will 'hide' the link
                  segment.curvature = -0.1,
                  segment.square = TRUE,
                  # segment.color = 'grey',
                  box.padding = 0.1,
                  point.padding = 0.6,
                  nudge_x = 0.15,
                  nudge_y = 1,
                  force = 0.5,
                  hjust = 0,
                  direction="y",
                  na.rm = TRUE, 
                  xlim = as.Date(c("2021-02-16", "2021-03-01")),
                  ylim = c(0,73.75))

enter image description here

Nicolás Velasquez
  • 5,623
  • 11
  • 22