
I am trying to make a figure with an arrow that is perpendicular to the x-axis and points up at it from outside the plot area. I can't figure out how to make the arrow appear while still letting the plot fit nicely in the ggarrange figure that combines multiple similar plots. There's a lot going on in this figure, so I'm not sure which part is causing the problem. I am using ggarrange because it lets me line up the axes of the plots, and I really want a solution that keeps the axes aligned.

Here is the plot that I want to create (I added the arrow in PowerPoint).

[Image: What I want to make using R]

I'm using the packages

library(tidyverse)
library(ggpubr)

Here is my reproducible example dataframe.

df <- tribble(~month, ~temperature, ~temp.datetime, ~people, ~people.datetime,
             "march", "70", "2018-03-25 9:00", "2", "2018-03-25 9:12",
             "march", "79", "2018-03-26 10:00", "1", "2018-03-26 10:12", 
             "march", "77", "2018-03-26 11:10", "9", "2018-03-26 11:12",
             "march", "75", "2018-03-26 12:00", "4", "2018-03-26 12:12",
             "march", "72", "2018-03-27 13:30", "5", "2018-03-27 13:12",
             "march", "71", "2018-03-28 14:00", "1", "2018-03-28 14:12",
             "april", "69", "2018-04-24 22:00", "4", "2018-04-24 22:12",
             "april", "81", "2018-04-25 0:00", "0", "2018-04-25 0:12",
             "april", "73", "2018-04-25 12:00", "0", "2018-04-25 12:12",
             "april", "70", "2018-04-26 1:00", "3", "2018-04-26 1:12",
             "april", "72", "2018-04-26 2:20", "8", "2018-04-26 2:12",
             "april", "75", "2018-04-26 3:00", "4", "2018-04-26 3:12",
             "april", "77", "2018-04-27 4:00", "2", "2018-04-27 4:12",
             "april", "75", "2018-04-28 5:13", "2", "2018-04-28 5:12",
             "may", "70", "2018-05-24 15:00", "1", "2018-05-24 14:12",
             "may", "79", "2018-05-26 16:00", "6", "2018-05-26 15:12",
             "may", "79", "2018-05-26 16:45", "2", "2018-05-26 16:12",
             "may", "75", "2018-05-26 17:00", "7", "2018-05-26 17:12",
             "may", "72", "2018-05-27 18:00", "2", "2018-05-27 18:12",
             "july", "75", "2018-07-23 12:00", "1", "2018-07-23 12:12",
             "july", "77", "2018-07-24 13:00", "2", "2018-07-24 13:12",
             "july", "81", "2018-07-25 14:00", "5", "2018-07-25 14:12",
             "july", "72", "2018-07-26 15:00", "2", "2018-07-26 15:12",
             "july", "75", "2018-07-26 16:10", "0", "2018-07-26 16:12",
             "july", "77", "2018-07-26 17:00", "2", "2018-07-26 17:12",
             "july", "75", "2018-07-27 18:20", "1", "2018-07-27 18:12")

First, I converted each column to the right type. Then I split the data frame into 4 subsets based on month (this makes more sense for my actual data).

df$temp.datetime <- as.POSIXct(df$temp.datetime)
df$people.datetime <- as.POSIXct(df$people.datetime)
df$temperature <- as.numeric(df$temperature)
df$people <- as.numeric(df$people)
df$month <- as.factor(df$month)
mar.df <- df %>% filter(month == "march")
apr.df <- df %>% filter(month == "april")
may.df <- df %>% filter(month == "may")
jul.df <- df %>% filter(month == "july")

Then, I made the same figure for each month of data. Each figure has two y-axes because I'm plotting two sets of data that were recorded at different times, so while they share an x-axis, the points don't exactly line up. The y-axis for the "people" data is the same across all 4 month plots so that they can be easily compared against one another.

tempcolor <- "#EBC400"
peopcolor <- "#3b60e9"

mar.temp.peop.time <- ggplot(mar.df, aes())+
  geom_point(aes(x = people.datetime, y = people), fill = peopcolor, shape = 23, size = 3)+
  geom_point(aes(x = temp.datetime, y = temperature/10), fill = tempcolor, shape = 21, size = 3)+
  geom_line(aes(x = people.datetime, y = people), color = peopcolor)+
  geom_line(aes(x = temp.datetime, y = temperature/10), color = tempcolor)+
  scale_y_continuous(name = "Number of People in Room", limits = c(0, 10), sec.axis = sec_axis(trans = ~.*10, name = "Temperature of Room"))+
  xlab("Date")+
  theme_classic(base_size = 17)+
  theme(axis.title.y = element_text(color = peopcolor, face = "bold"), axis.title.y.right = element_text(color = tempcolor, face = "bold"))+
  theme(axis.title.x = element_blank())
mar.temp.peop.time

apr.temp.peop.time <- ggplot(apr.df, aes())+
  geom_point(aes(x = people.datetime, y = people), fill = peopcolor, shape = 23, size = 3)+
  geom_point(aes(x = temp.datetime, y = temperature/10), fill = tempcolor, shape = 21, size = 3)+
  geom_line(aes(x = people.datetime, y = people), color = peopcolor)+
  geom_line(aes(x = temp.datetime, y = temperature/10), color = tempcolor)+
  scale_y_continuous(name = "Number of People in Room", limits = c(0, 10), sec.axis = sec_axis(trans = ~.*10, name = "Temperature of Room"))+
  xlab("Date")+
  theme_classic(base_size = 17)+
  theme(axis.title.y = element_text(color = peopcolor, face = "bold"), axis.title.y.right = element_text(color = tempcolor, face = "bold"))+
  theme(axis.title.x = element_blank())
apr.temp.peop.time

may.temp.peop.time <- ggplot(may.df, aes())+
  geom_point(aes(x = people.datetime, y = people), fill = peopcolor, shape = 23, size = 3)+
  geom_point(aes(x = temp.datetime, y = temperature/10), fill = tempcolor, shape = 21, size = 3)+
  geom_line(aes(x = people.datetime, y = people), color = peopcolor)+
  geom_line(aes(x = temp.datetime, y = temperature/10), color = tempcolor)+
  scale_y_continuous(name = "Number of People in Room", limits = c(0, 10), sec.axis = sec_axis(trans = ~.*10, name = "Temperature of Room"))+
  xlab("Date")+
  theme_classic(base_size = 17)+
  theme(axis.title.y = element_text(color = peopcolor, face = "bold"), axis.title.y.right = element_text(color = tempcolor, face = "bold"))
may.temp.peop.time

jul.temp.peop.time <- ggplot(jul.df, aes())+
  geom_point(aes(x = people.datetime, y = people), fill = peopcolor, shape = 23, size = 3)+
  geom_point(aes(x = temp.datetime, y = temperature/10), fill = tempcolor, shape = 21, size = 3)+
  geom_line(aes(x = people.datetime, y = people), color = peopcolor)+
  geom_line(aes(x = temp.datetime, y = temperature/10), color = tempcolor)+
  scale_y_continuous(name = "Number of People in Room", limits = c(0, 10), sec.axis = sec_axis(trans = ~.*10, name = "Temperature of Room"))+
  xlab("Date")+
  theme_classic(base_size = 17)+
  theme(axis.title.y = element_text(color = peopcolor, face = "bold"), axis.title.y.right = element_text(color = tempcolor, face = "bold"))
jul.temp.peop.time
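
The two-axis setup used in the four plots above can be sketched in isolation. This is only a minimal sketch on a made-up data frame (`toy`, `hour` and `scale_factor` are hypothetical names, not part of my real data): temperature is divided by 10 so that it fits on the 0 to 10 "people" axis, and `sec_axis()` multiplies by 10 again so the right-hand axis is labelled in the original units. The transformation is passed positionally here because, as far as I know, the `trans` argument was renamed to `transform` in ggplot2 3.5.0.

```r
library(ggplot2)

# Hypothetical toy data, only for illustration
toy <- data.frame(
  hour = 1:4,
  people = c(2, 1, 9, 4),
  temperature = c(70, 79, 77, 75)
)

# Factor that squeezes temperature onto the "people" axis
scale_factor <- 10

p <- ggplot(toy) +
  geom_line(aes(x = hour, y = people)) +
  # Temperature is plotted in scaled-down units on the primary axis
  geom_line(aes(x = hour, y = temperature / scale_factor)) +
  scale_y_continuous(
    name = "Number of People in Room",
    # The secondary axis undoes the scaling for its labels
    sec.axis = sec_axis(~ . * scale_factor, name = "Temperature of Room")
  )

# y values that ggplot2 actually draws for the temperature layer (layer 2)
plotted_temperature <- layer_data(p, 2)$y
```

Multiplying `plotted_temperature` by `scale_factor` gives back the original temperatures, which is exactly what the secondary axis shows.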

Then, I used ggarrange to combine the 4 plots into one with the axes aligned.

eachmonth <- ggarrange(
  mar.temp.peop.time, apr.temp.peop.time, may.temp.peop.time, jul.temp.peop.time,
  ncol = 2, nrow = 2,
  labels = c("A", "B", "C", "D"),
  label.x = .03, font.label = list(size = 25), 
  align = "v", heights = c(5,5,5,5)
)
eachmonth

It'd be really nice if the arrow avoided the labels on the x-axis, because where I want to put the arrow (it marks a specific point in time) actually overlaps a little with one of the date tick labels. Also, if there are any parts of this code that could be cleaned up or done differently, please let me know! I'm still learning.

I've tried looking for an answer on this site, but none seem to be exactly what I'm looking for. This here seems close to the solution, but I can't get the arrow to show up (even on the plot before the ggarrange). I also tried using annotate() like below, but again, I can't get the arrow to show up.

annotate(geom = "segment", x = as.POSIXct("2018-07-24 12:30"), y = 0, xend = as.POSIXct("2018-07-24 12:30"), yend = -3.8, arrow = arrow(length = unit(2, "mm")), color = "red")
Bizzy
  • Where do you want to put the arrow? on all plots (e.g. four times) or only on the combine plot? – TarJae May 19 '21 at 18:33
  • Just once, for the July plot (like in the picture attached). The way @stefan did it is what I wanted! – Bizzy May 20 '21 at 18:31

1 Answer


Maybe this fits your needs.

  1. Instead of ggpubr::ggarrange I use patchwork, as it does a great job of aligning plots and works easily with lists via patchwork::wrap_plots().

  2. You could simplify your code by splitting your data into a list via e.g. split() and applying your plotting code to each list element using e.g. lapply(). To this end I use a helper function which contains your plotting code. Note: To keep the months in the right order I use forcats::fct_inorder().

  3. Your red arrow does not appear, even in the single plot, because you set the limits of your y-scale to c(0, 10). This has the side effect of removing all data which does not fit into the range of the limits, i.e. your red arrow gets dropped as it extends down to -3.8. This can be avoided by setting the limits via coord_cartesian() instead. Additionally I set clip = "off" in the coord so that the arrow is not clipped off when it crosses the panel border.

  4. To switch the direction of the arrow simply switch y and yend in annotate.

  5. For the issue with the overplotting of the tick labels I would suggest reducing the base font size, as I did.
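
To illustrate point 3 in isolation, here is a minimal sketch on a made-up toy data frame with a numeric x-axis (`toy` and `arrow_layer` are hypothetical names, not your data). The same annotation is censored when the limits are set on the scale but survives when they are set on the coord.

```r
library(ggplot2)

# Hypothetical toy data, only for illustration
toy <- data.frame(x = 1:5, y = c(2, 4, 6, 8, 9))

# Arrow that starts below the panel (y = -3.8) and points up at the x-axis
arrow_layer <- annotate(
  geom = "segment", x = 3, xend = 3, y = -3.8, yend = 0,
  arrow = arrow(length = unit(2, "mm")), color = "red"
)

base <- ggplot(toy, aes(x, y)) + geom_point() + arrow_layer

# Limits on the scale: y = -3.8 is out of bounds and is replaced by NA,
# so the segment is dropped (with a warning) when the plot is drawn
p_scale <- base + scale_y_continuous(limits = c(0, 10))

# Limits on the coord: nothing is removed, and clip = "off" allows
# drawing outside the panel area
p_coord <- base + coord_cartesian(ylim = c(0, 10), clip = "off")

# Data of the annotation layer (layer 2) after the build step
y_scale <- layer_data(p_scale, 2)$y
y_coord <- layer_data(p_coord, 2)$y
```

Here `y_scale` is `NA` while `y_coord` is still -3.8. Printing `p_scale` shows no arrow, whereas `p_coord` draws it extending below the panel.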

library(tibble)
library(ggplot2)
library(patchwork)

df$month <- forcats::fct_inorder(df$month)

df_list <- split(df, df$month)

tempcolor <- "#EBC400"
peopcolor <- "#3b60e9"

plot_fun <- function(x) {
  ggplot(x, aes())+
    geom_point(aes(x = people.datetime, y = people), fill = peopcolor, shape = 23, size = 3)+
    geom_point(aes(x = temp.datetime, y = temperature/10), fill = tempcolor, shape = 21, size = 3)+
    geom_line(aes(x = people.datetime, y = people), color = peopcolor)+
    geom_line(aes(x = temp.datetime, y = temperature/10), color = tempcolor)+
    scale_y_continuous(name = "Number of People in Room", sec.axis = sec_axis(trans = ~.*10, name = "Temperature of Room"), expand = c(0, 0)) +
    coord_cartesian(ylim = c(0, 10), clip = "off") +
    xlab("Date")+
    theme_classic(base_size = 10)+
    theme(axis.title.y = element_text(color = peopcolor, face = "bold"), axis.title.y.right = element_text(color = tempcolor, face = "bold"),
          axis.title.x = element_blank())
}

plots <- lapply(df_list, plot_fun)

# Add the arrow to the July plot only; it runs from y = -3.8 up to the x-axis
plots[["july"]] <- plots[["july"]] +
  annotate(geom = "segment",
           x = as.POSIXct("2018-07-24 12:30"), xend = as.POSIXct("2018-07-24 12:30"),
           y = -3.8, yend = 0,
           arrow = arrow(length = unit(2, "mm")), color = "red")

# plot_annotation() is added to the assembled patchwork with `+`
eachmonth <- wrap_plots(plots) + plot_annotation(tag_levels = "A")

eachmonth

stefan
  • Thanks! I appreciate your explanation of using split, but for my actual figure I have different scales for the second y-axes, so I don't think I'll be able to use that here (but it's nice to know for the future). Using steps 3 and 4 from your answer I got my plot to work on `ggarrange` still and didn't have to change my font size. I had to tweak this a little -- I added `expand = expansion(mult = c(0.05, 0.02))` rather than `expand = c(0, 0)` so that the points didn't overlap with the x-axis. Could you explain how `coord_cartesian` is different than the scale I had set in `scale_y_continuous`? – Bizzy May 20 '21 at 16:27
  • Hi @Bizzy. Making use of `ggarrange` is absolutely fine. I should have mentioned that `patchwork` is not necessary. Concerning your question: when you set the limits via the scale, all adjustments take place before the plot is drawn, i.e. everything which does not fit into the limits is removed from the plot. In some sense that's similar to filtering the data before drawing the plot. In contrast, the limits you set via the coord are applied **after** the plot is drawn, which means that nothing is removed. Only the range of the data displayed is adjusted. – stefan May 20 '21 at 18:00
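
For anyone who wants to stay with `ggarrange`, the approach discussed in these comments might look roughly like the sketch below. It uses made-up toy data and a hypothetical helper `make_plot()`, so it is an illustration under those assumptions rather than the exact code from the question: the limits move to `coord_cartesian()`, clipping is switched off, and the scale gets a small expansion so the points do not sit on the x-axis.

```r
library(ggplot2)
library(ggpubr)

# Hypothetical toy data, only for illustration
toy <- data.frame(x = 1:5, y = c(2, 4, 6, 8, 9))

# Hypothetical helper: limits on the coord, a little padding on the scale
make_plot <- function(d) {
  ggplot(d, aes(x, y)) +
    geom_point() +
    scale_y_continuous(expand = expansion(mult = c(0.05, 0.02))) +
    coord_cartesian(ylim = c(0, 10), clip = "off") +
    theme_classic()
}

p_plain <- make_plot(toy)

# Only this panel gets the arrow pointing up at the x-axis
p_arrow <- make_plot(toy) +
  annotate(geom = "segment", x = 3, xend = 3, y = -3.8, yend = 0,
           arrow = arrow(length = unit(2, "mm")), color = "red")

# Stack the panels with vertically aligned axes
combined <- ggarrange(p_plain, p_arrow,
                      ncol = 1, nrow = 2,
                      labels = c("A", "B"), align = "v")

# The arrow's start value is still present after the build step
arrow_y <- layer_data(p_arrow, 2)$y
```

Because the limits live on the coord, `arrow_y` keeps its value of -3.8 instead of being censored to `NA`.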