
I want to create a plot (preferably using ggplot2) that shows a timeline together with a time-trend plot.

To put it in a practical example, I have aggregated unemployment rates for each year. I also have a data set denoting important legislation changes that are related to the labor market. I therefore want to draw these events as a timeline that shares its x-axis (time) with the unemployment rate.

I have generated some toy data; see the code below:

set.seed(2110)
year <- c(1950:2020)
unemployment <- rnorm(length(year), 0.05, 0.005)
un_emp <- data.frame(cbind(year, unemployment))


year <- c( 1957, 1961, 1975, 1976, 1983, 1985, 1995, 1999, 2011, 2018)
events <- c("Implemented unemployment benefit", 
            "Pre-school became free", 
            "Five-day workweek were introduced", 
            "Labor law reform 1976", 
            "Unemployment benefit were cut in half", 
            "Apprenticeship Act allows on-the-job training",
            "Changes in discrimination law",
            "Equal Pay for Equal Work was", 
            "9 weeks vacation were introduced",
            "Unemployment benefit were removed")

imp_event  <- data.frame(year, events)

I can easily plot the time-trend across the years:

library(tidyverse)
                      
ggplot(data = un_emp, aes(x = year, y = unemployment)) + 
  geom_line(color = "#FC4E07", size = 0.5) +
  theme_bw()

[Figure: time trend of the unemployment rate]

But how do I include the events (found in imp_event) in the plot in a nice and efficient way?

My aim is to make a timeline looking like the one from here but to combine it with the time-trend plot shown above. How can I do this?


I have tried geom_vline, but I cannot add the label of each event.

Thanks!

Waldi
ecl

3 Answers


I think this should do the trick:

First, I created the timeline axis with geom_hline, using the mean you set for the data (0.05) as the y intercept. Then I added a variable "height" to the events data frame, which takes the value of the axis and adds a value drawn from a normal distribution. I used this to draw the segments that run from the axis to each point. Finally, I flipped the y position of the year label so that it is always on the opposite side of the axis from the segment.

library(tidyverse)

set.seed(2110)
year <- c(1950:2020)
unemployment <- rnorm(length(year), 0.05, 0.005)
un_emp <- data.frame(cbind(year, unemployment))

year <- c( 1957, 1961, 1975, 1976, 1983, 1985, 1995, 1999, 2011, 2018)
events <- c("Implemented unemployment benefit", 
            "Pre-school became free", 
            "Five-day workweek were introduced", 
            "Labor law reform 1976", 
            "Unemployment benefit were cut in half", 
            "Apprenticeship Act allows on-the-job training",
            "Changes in discrimination law",
            "Equal Pay for Equal Work was", 
            "9 weeks vacation were introduced",
            "Unemployment benefit were removed")

imp_event  <- data.frame(year, events) %>% 
  mutate(height = mean(unemployment) + rnorm(n(), 0, 0.02))

ggplot(un_emp) +
  
  geom_hline(yintercept = 0.05) +
  
  geom_line(aes(x = year,
                y = unemployment),
            color = "red",
            alpha = 0.3,
            size = 1) +
  
  geom_segment(data = imp_event,
               aes(x = year,
                   xend = year,
                   y = 0.05,
                   yend = height)) +
  
  geom_text(data = imp_event,
            aes(label = year, 
                x = year,
                y = 0.05 + 0.002 * sign(0.05 - height)), 
            angle = 90, 
            size = 3.5, 
            fontface = "bold",
            check_overlap = TRUE) +
  
  geom_point(data = imp_event,
             aes(x = year,
                 y = height,
                 fill = as.factor(events)),
             shape = 21,
             size = 4) +
  
  scale_x_continuous(name = NULL, 
                     labels = NULL) +
  
  scale_fill_discrete(name = "Event") +
  
  scale_y_continuous(name = "Unemployment Rate") +
  
  theme_bw() + 
  
  theme(panel.border = element_blank(),
        axis.line.y  = element_line(),
        axis.ticks.x = element_blank(),
        panel.grid = element_blank(),
        legend.position="bottom")

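The random `height` values above depend on the seed, and two neighbouring events can end up at nearly the same height. As a hedged variation (not part of the original answer), the sketch below cycles through a few fixed offsets instead. The names `baseline`, `offsets`, `label_y`, `imp_event_fixed` and `p_fixed`, and the offset values themselves, are assumptions made up for this illustration.

```r
library(ggplot2)

year <- c(1957, 1961, 1975, 1976, 1983, 1985, 1995, 1999, 2011, 2018)

# Assumed values: the timeline axis sits at the mean used for the toy data,
# and the offsets are arbitrary distances above (+) and below (-) that axis.
baseline <- 0.05
offsets  <- rep_len(c(0.010, -0.010, 0.020, -0.020), length(year))

imp_event_fixed <- data.frame(year   = year,
                              height = baseline + offsets)

# Same trick as in the answer: put the year label on the side of the axis
# opposite to the segment.
imp_event_fixed$label_y <- baseline +
  0.002 * sign(baseline - imp_event_fixed$height)

p_fixed <- ggplot(imp_event_fixed) +
  geom_hline(yintercept = baseline) +
  geom_segment(aes(x = year, xend = year, y = baseline, yend = height)) +
  geom_point(aes(x = year, y = height), shape = 21, fill = "white", size = 4) +
  geom_text(aes(x = year, y = label_y, label = year),
            angle = 90, size = 3.5, fontface = "bold") +
  theme_bw()

p_fixed
```

Because the offsets alternate in sign, consecutive events such as 1975 and 1976 always point in opposite directions, which is the main thing the random heights cannot guarantee.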

Ramiro Reyes

I worked with Jon Spring's solution but replaced geom_segment with geom_vline, which gave a result close to what I wanted. The final code looked like this:


joined_data <- un_emp %>% left_join(imp_event, by = "year")

ggplot(data = joined_data, aes(x = year, y = unemployment)) + 
  geom_line(color = "red", size = 0.5) +

  theme_classic() +
  labs(y = "Unemployment rate", 
       x = "Years", 
       caption = "Data from XXXX") +
  geom_vline(data = joined_data %>% filter(!is.na(events)),
             aes(xintercept = year), color = "gray70", linetype = "dashed") +
  ggrepel::geom_text_repel(data = joined_data,
                           aes(x = year, y = unemployment - 0.03,
                               label = str_wrap(events, 10)),
                           color = "gray70", direction = "y", size = 2.5,
                           lineheight = 0.7, point.padding = 0.8)

[Figure: the resulting plot, with a dashed vertical line and a wrapped text label for each event]

I would like to award the bounty to @Jon Spring, but I am not sure how to reward a comment.

ecl

You can achieve this by overlaying a geom_text() call. A layer can take its own data frame, but it still needs both an x and a y aesthetic, and imp_event has no unemployment column, so you can't just feed it imp_event as is and overlay that.

Instead, you can achieve what you want by doing a left_join from un_emp to imp_event on year. Because imp_event only has rows for the event years, events will be NA for most rows of the joined data frame, which is perfect as I suspect you only want each event to appear as a label once.

For example:

joined_data <- un_emp %>% left_join(imp_event, by = "year")

ggplot(data = joined_data, aes(x = year, y = unemployment)) + 
  geom_line(color = "#FC4E07", size = 0.5) +
  geom_text(data = joined_data, aes(x = year, y = unemployment, label = events), size = 3) +  # size is a fixed setting, so it goes outside aes()
  theme_bw() 

[Figure: the resulting plot, with each event printed as text on the unemployment line]

You can have a look at the available options and play around with geom_text() here.
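
To check what the left join described above leaves you with, here is a minimal sketch. It uses base R's `merge(..., all.x = TRUE)` as a stand-in for `left_join`, only three of the events to keep it short, and `joined_check` is a name made up for this illustration.

```r
set.seed(2110)
un_emp <- data.frame(year = 1950:2020,
                     unemployment = rnorm(71, 0.05, 0.005))

# Shortened event list, for illustration only
imp_event <- data.frame(
  year   = c(1957, 1961, 1975),
  events = c("Implemented unemployment benefit",
             "Pre-school became free",
             "Five-day workweek were introduced")
)

# all.x = TRUE keeps every row of un_emp, which is what a left join does
joined_check <- merge(un_emp, imp_event, by = "year", all.x = TRUE)

nrow(joined_check)                # still one row per year
sum(!is.na(joined_check$events))  # only the event years carry a label
```

Rows whose label is NA are dropped by geom_text() (with a warning), so each event is drawn once.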

C.Robin
  • I was hoping for something where the events are tied more closely to the x-axis, maybe with a line from the x-axis, and preferably a bit lower down. – ecl Jun 18 '21 at 09:04
  • Ok, this would take me ages as I'm really not very good in R yet. I suggest you edit your question to make it clearer exactly what you're hoping to get and someone else may pick it up (if my answer was at least partially helpful you can upvote it, even if you don't accept it as answering your question) – C.Robin Jun 18 '21 at 10:04
  • This is a good start and I'd suggest adding `geom_segment(data = joined_data %>% filter(!is.na(events)), aes(xend = year, yend = 0), color = "gray70") + ggrepel::geom_text_repel(data = joined_data, aes(x = year, y = unemployment-0.03, label = str_wrap(events, 12)), direction = "y", size = 3, lineheight = 0.7, point.padding = 0.8) +` to get lines connecting to the x axis and to place the text more compactly without overlaps. – Jon Spring Jun 29 '21 at 17:47
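
The suggestion in the last comment can be put together into a runnable sketch along these lines. It assumes the ggrepel package is installed, uses a shortened event list, and filters the event rows once up front (`event_rows` and `p_events` are names made up here) so that neither layer receives NA labels.

```r
library(tidyverse)

set.seed(2110)
un_emp <- data.frame(year = 1950:2020,
                     unemployment = rnorm(71, 0.05, 0.005))

# Shortened event list, for illustration only
imp_event <- data.frame(
  year   = c(1957, 1976, 1995, 2018),
  events = c("Implemented unemployment benefit",
             "Labor law reform 1976",
             "Changes in discrimination law",
             "Unemployment benefit were removed")
)

joined_data <- un_emp %>% left_join(imp_event, by = "year")

# Keep only the years that actually have an event
event_rows <- joined_data %>% filter(!is.na(events))

p_events <- ggplot(joined_data, aes(x = year, y = unemployment)) +
  geom_line(color = "#FC4E07") +
  # Segment from the line down to y = 0; x and y are inherited from ggplot()
  geom_segment(data = event_rows,
               aes(xend = year, yend = 0), color = "gray70") +
  # Wrapped labels below the line, repelled only in the y direction
  ggrepel::geom_text_repel(data = event_rows,
                           aes(y = unemployment - 0.03,
                               label = str_wrap(events, 12)),
                           direction = "y", size = 3,
                           lineheight = 0.7, point.padding = 0.8) +
  theme_bw()

p_events
```

Note that `yend = 0` stretches the y-axis down to zero, which flattens the toy series visually; a different end point may suit real data better.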