6

I am using R to generate a Quarto document with figures and tables. This document should be rendered to create both an HTML and a PDF file. Most of it works just fine. However, I have figures with many legends and some of the legends are cut on the sides when generating the plot.

Legend does not fit in the figure

I have found solutions to resize the legend so that the all legends fit in the figure using this:

theme(legend.text = element_text(size = 6),
      legend.title = element_text(size = 11))

This gives a nice figure in the HTML document:

The legend fits in the HTML document

However, when I try to render the PDF document, here is what the figure looks like:

The problem is still not solved for PDF documents

Of course I could find solutions for the height of the figure, but I did not include code for that in the reproducible example I created. However, as can be seen, the legends are still cut.

Here a reproducible example of the Quarto document:

---
title: "Reproducible Example"
format:
  html:
    toc: true
  pdf:
    toc: true
---

This is a reproducible example to present my problem.

```{r}
library(tidyverse)
library(cowplot)
library(ggnewscale)
library(ggtext)
```

## Create data

```{r}
species_df <- tibble(fish_species = factor(x = c("Rainbow trout<br />(<i>Oncorhynchus mykiss</i>)", "Brown trout<br />(<i>Salmo trutta fario</i>)", "Whitefish<br />(<i>Coregonus sp.</i>)", "Grayling<br />(<i>Thymallus sp.</i>)", "Other salmonid<br />(other <i>Salmonidae</i>)", "Common perch<br />(<i>Perca fluviatilis</i>)", "Pikeperch<br />(<i>Sander lucioperca</i>)", "Other percid<br />(other <i>Percidae</i>)", "Koi<br />(<i>Cyprinus carpio</i>)", "Other carp<br />(other <i>Cyprinidae</i>)", "Freshwater ornamental fish<br />(diverse species)", "Saltwater ornamental fish<br />(diverse species)", "Crayfish<br />(<i>Crustacea</i>)", "Missing data"),
                                           levels = c("Rainbow trout<br />(<i>Oncorhynchus mykiss</i>)", "Brown trout<br />(<i>Salmo trutta fario</i>)", "Whitefish<br />(<i>Coregonus sp.</i>)", "Grayling<br />(<i>Thymallus sp.</i>)", "Other salmonid<br />(other <i>Salmonidae</i>)", "Common perch<br />(<i>Perca fluviatilis</i>)", "Pikeperch<br />(<i>Sander lucioperca</i>)", "Other percid<br />(other <i>Percidae</i>)", "Koi<br />(<i>Cyprinus carpio</i>)", "Other carp<br />(other <i>Cyprinidae</i>)", "Freshwater ornamental fish<br />(diverse species)", "Saltwater ornamental fish<br />(diverse species)", "Crayfish<br />(<i>Crustacea</i>)", "Missing data"))) |> 
  mutate(family = factor(x = case_when(fish_species %in% c("Rainbow trout<br />(<i>Oncorhynchus mykiss</i>)", "Brown trout<br />(<i>Salmo trutta fario</i>)", "Whitefish<br />(<i>Coregonus sp.</i>)", "Grayling<br />(<i>Thymallus sp.</i>)", "Other salmonid<br />(other <i>Salmonidae</i>)") ~ "Salmonid<br />(<i>Salmonidae</i>)",
                                       fish_species %in% c("Common perch<br />(<i>Perca fluviatilis</i>)", "Pikeperch<br />(<i>Sander lucioperca</i>)", "Other percid<br />(other <i>Percidae</i>)") ~ "Percid<br />(<i>Percidae</i>)",
                                       fish_species %in% c("Koi<br />(<i>Cyprinus carpio</i>)", "Other carp<br />(other <i>Cyprinidae</i>)") ~ "Cyprinid<br />(<i>Cyprinidae</i>)",
                                       fish_species %in% c("Freshwater ornamental fish<br />(diverse species)", "Saltwater ornamental fish<br />(diverse species)") ~ "Ornamental fish",
                                       fish_species %in% c("Crayfish<br />(<i>Crustacea</i>)") ~ "Crayfish<br />(<i>Crustacea</i>)",
                                       TRUE ~ "Other"),
                         levels = c("Salmonid<br />(<i>Salmonidae</i>)", "Percid<br />(<i>Percidae</i>)", "Cyprinid<br />(<i>Cyprinidae</i>)", "Ornamental fish", "Crayfish<br />(<i>Crustacea</i>)", "Other")),
         family_sober = factor(x = word(string = family,
                                        sep = "<br />"),
                               levels = word(string = levels(family),
                                             sep = "<br />")))

quartal <- paste("Quartal", 1:4)
year <- 2020:2022

quartal_df <- crossing(quartal, year) |> 
  mutate(quartal_year = factor(x = paste(year, quartal, sep = " - "),
                               levels = sort(paste(year, quartal, sep = " - ")))) |> 
  arrange(quartal_year) |> 
  mutate(quartal_num = seq_len(n())) |> 
  slice_tail(n = 9)

df <- species_df |> 
  crossing(quartal_df) |> 
  mutate(number = sample(x = 1:20, size = n(), replace = TRUE))
```

## Create plot

```{r}
# Prepare x axis breaks for ticks
quartal_breaks <- df |> 
  distinct(quartal_year, year) |> 
  group_by(year) |> 
  summarise(n_quartals = n()) |> 
  mutate(breaks = NA)

for (i in seq_len(nrow(quartal_breaks))) {
  
  quartal_breaks$breaks[i] <- 1 + sum(quartal_breaks$n_quartals[seq_len(i - 1)])
}



#Prepare colours
n_groups <- df |> distinct(family) |> nrow()
colour_group <- RColorBrewer::brewer.pal(name = "Dark2", n = n_groups)
colours <- c()

j <- 0

for (i in seq_len(n_groups)) {
  j <- j + 1
  
  n_in_group <- df |> filter(family == levels(df$family)[i]) |> distinct(fish_species) |> nrow()
    
  group_palette <- colorRampPalette(colors = c(colour_group[j], "#FFFFFF"))
    
  group_colours <- group_palette(n_in_group + 1) |> head(-1)
    
  colours <- append(colours, group_colours)
}

colours <- setNames(colours, df |> distinct(fish_species) |> pull(fish_species) |> sort())



#Create plot
fig <- ggplot(data = df) +
  geom_line(aes(x = quartal_num, y = number, colour = fish_species))

j <- 0

for (i in df |> distinct(family) |> arrange(family) |> pull()) {
  
  j <- j + 1
  
  fig <- fig +
    geom_line(aes(x = quartal_num, y = number, colour = fish_species)) +
    scale_colour_manual(aesthetics = "colour",
                        values = colours,
                        labels = df |> filter(family == i) |> distinct(fish_species) |> pull(fish_species),
                        breaks = df |> filter(family == i) |> distinct(fish_species) |> pull(fish_species),
                        name = i,
                        guide = guide_legend(title.position = "top", direction = "vertical", order = j)) +
    new_scale_colour()
}

fig <- fig +
  facet_wrap(vars(family_sober)) +
  scale_x_continuous(breaks = quartal_breaks$breaks,
                     labels = quartal_breaks$year,
                     minor_breaks = c(1:9)) +
  xlab("Time") +
  ylab("Number") +
  guides(color = guide_legend(override.aes = list(size = 0.8))) +
  theme(legend.position = "bottom",
        legend.text = element_markdown(size = 6),
        legend.key.height = unit(1.8, units = "char"),
        legend.margin = margin(t = 0, r = 0, b = 0, l = 0, unit='cm'),
        legend.spacing = unit(0.5, units = "char"),
        legend.title = element_markdown(size = 11),
        axis.text.x=element_text(angle=45, hjust=1, size = 7))



#Prepare plot to print
# fig_legend <- get_legend(fig)
# 
# fig_nolegend <- fig +
#   theme(legend.position = "none")
# 
# fig_print <- plot_grid(fig_nolegend,
#                        fig_legend,
#                        ncol = 1,
#                        rel_heights = c(3, 1))

print(fig)
# print(fig_print)
```

I have tried to use get_legend from cowplot to extract the legend and then combine 1) the figure without the legend (theme(legend.position = "none")) and 2) the legend alone (cowplot::get_legend()) (see code at the end of the reproducible example), but the problem is that during the extraction process of the legend a virtual plot is created, and the legend extracted will be cut depending on the rendering version used as can be seen below:

Legend extracted with cowplot::get_legend

I have already found a lot of material on the web to adapt the size of the legend by changing the text size and/or of other options in the legend, but they all require to do it manually for each figure and for each rendering option.

To avoid that, I am searching for another way to extract the whole legend (without any cut on the sides) before printing the plot in order to be able to combine it separately to the generated figure without legend, in order to adapt the size of the legend to the material it should be printed on.

Thanks in advance for your help!

GaryDe
  • 492
  • 1
  • 5
  • 17

1 Answers1

1

One approach could be using if-else logic to store values for sizes of different ggplot components in a list and then use that list in the ggplot object instead of hardcoded values. And we can use knitr::is_html_output and knitr::is_latex_output to determine the document output format. (For more output variant, Quarto's condtitional content may be used.

```{r}
if(knitr::is_html_output()) {
  size_list <- list(
    legend_text = 7,
    legend_title = 11,
    legend_space = 0.7,
    fig_height = 6,
    fig_width = 8
  )
} else if (knitr::is_latex_output()) {
  size_list <- list(
    legend_text = 7,
    legend_title = 10,
    legend_space = 0.5,
    fig_height = 6,
    fig_width = 8
  )
}
```

```{r}
#| fig-height: !expr size_list$fig_height
#| fig-width: !expr size_list$fig_width

fig <- fig +
  facet_wrap(vars(family_sober)) +
  scale_x_continuous(breaks = quartal_breaks$breaks,
                     labels = quartal_breaks$year,
                     minor_breaks = c(1:9)) +
  xlab("Time") +
  ylab("Number") +
  guides(color = guide_legend(override.aes = list(size = 0.8))) +
  theme(legend.position = "bottom",
        legend.text = element_markdown(size = size_list$legend_text),
        legend.key.height = unit(1.8, units = "char"),
        legend.margin = margin(t = 0, r = 0, b = 0, l = 0, unit='cm'),
        legend.spacing = unit(size_list$legend_space, units = "char"),
        legend.title = element_markdown(size = size_list$legend_title),
        axis.text.x=element_text(angle=45, hjust=1, size = 7))

print(fig)
```

(I have posted the last part of the code given in question, where necessary changes are done)

pdf output


pdf output


html output


html output


shafee
  • 15,566
  • 3
  • 19
  • 47
  • 3
    Thanks, it's a good workaround. However, it means that the right values must be found by trial and error for each plot and for each output format, which is time consuming. Additionally, if this is a report that it actualised after a given time period, you should each time check that it is still working, as adding a new category could lead back to the same problem. I will wait to see if somebody proposes another solution, and if not then I will accept your answer. – GaryDe Jan 13 '23 at 10:29