I am new to the ggplot2 library and am trying to draw a plot from the following data frame:

library(tidyverse)

df <-
  tribble(~event, ~startdate,~enddate,~loc,
      "A",as.POSIXct("1984/02/10"),as.POSIXct("1987/06/10"),"1",
      "B",as.POSIXct("1984/02/11"),as.POSIXct("1990/02/12"),"2",
      "A",as.POSIXct("1992/05/15"),as.POSIXct("1999/06/15"),"3",
      "C",as.POSIXct("2003/08/29"),as.POSIXct("2015/08/29"),"4",
      "B",as.POSIXct("2002/04/11"),as.POSIXct("2012/04/12"),"5",
      "E",as.POSIXct("2000/02/10"),as.POSIXct("2005/02/15"),"6",
      "A",as.POSIXct("1985/02/10"),as.POSIXct("1987/06/10"),"7",
      "B",as.POSIXct("1989/02/11"),as.POSIXct("1990/02/12"),"8",
      "A",as.POSIXct("1997/05/15"),as.POSIXct("1999/06/15"),"9",
      "C",as.POSIXct("2010/08/29"),as.POSIXct("2015/08/29"),"10",
      "B",as.POSIXct("2010/04/11"),as.POSIXct("2012/04/12"),"11",
      "E",as.POSIXct("2004/02/10"),as.POSIXct("2005/02/15"),"12")
max_date = max(df$startdate,df$enddate)

I am using the following code:

ggplot(df)+
  geom_segment(aes(y=loc, yend = loc, x = startdate, xend = enddate, colour=event),size = 5,alpha=0.6) +
  geom_label(aes(label=event, y = loc, x=max_date), size=2) +
  xlab("Year") + ylab("LoC") +
  scale_x_datetime(date_breaks = "year", date_labels = "%Y") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  facet_grid(rows = vars(event), scales = "free")

It generates the following plot: [image]

I would like to order the LoC axis (the primary y-axis) within each event (the facet label on the secondary y-axis). Within each event, the LoC values should appear in ascending order. Events C (4 -> 10) and E (6 -> 12) are already in ascending order. How can I put the other points in ascending order within each event? I tried sorting the data with order and with fct_inorder from the forcats library, but could not get the desired result shown below. Also, how can I print a single label within each group? [image]

Please feel free to correct me!

Any help would be great! Thank you!

Arthur Yip
Saurabh Chauhan

2 Answers

You were on the right track with the forcats package. Use fct_inseq to put the levels in numeric order, and fct_rev to flip that order depending on whether you want small -> large or the opposite. I also added fct_reorder2 for your events (ordered by enddate and loc), but you might prefer to keep them in alphabetical order.
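To see what these two functions do in isolation, here is a minimal sketch on a made-up vector of loc values. The vector and the variable names are illustrative only and are not part of the question's data, and the sketch assumes a forcats version recent enough to provide fct_inseq.

```r
library(forcats)

# Hypothetical loc values, stored as text like the loc column in the question
loc <- c("1", "2", "10", "11")

# A plain factor sorts its levels as text, so "10" and "11" land before "2"
loc_text <- factor(loc)
levels(loc_text)      # "1", "10", "11", "2"

# fct_inseq() reorders the levels by their numeric value
loc_numeric <- fct_inseq(loc_text)
levels(loc_numeric)   # "1", "2", "10", "11"

# fct_rev() reverses whatever order the levels currently have
loc_reversed <- fct_rev(loc_numeric)
levels(loc_reversed)  # "11", "10", "2", "1"
```

Because ggplot2 draws the first level of a discrete y-axis at the bottom, the reversed version is the one that reads in ascending order from top to bottom.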

As for the labels, it's probably best to create a separate data frame that specifies where each label goes. I've done this below with group_by and summarize; the code comments note other placement options.

library(tidyverse, quietly = TRUE, warn.conflicts = FALSE)
#> Warning: package 'ggplot2' was built under R version 3.6.3
#> Warning: package 'tibble' was built under R version 3.6.3
#> Warning: package 'tidyr' was built under R version 3.6.3
#> Warning: package 'purrr' was built under R version 3.6.3
#> Warning: package 'dplyr' was built under R version 3.6.3
#> Warning: package 'forcats' was built under R version 3.6.3
df <- tribble(~event, ~startdate,~enddate,~loc,
          "A",as.POSIXct("1984/02/10"),as.POSIXct("1987/06/10"),"1",
          "B",as.POSIXct("1984/02/11"),as.POSIXct("1990/02/12"),"2",
          "A",as.POSIXct("1992/05/15"),as.POSIXct("1999/06/15"),"3",
          "C",as.POSIXct("2003/08/29"),as.POSIXct("2015/08/29"),"4",
          "B",as.POSIXct("2002/04/11"),as.POSIXct("2012/04/12"),"5",
          "E",as.POSIXct("2000/02/10"),as.POSIXct("2005/02/15"),"6",
          "A",as.POSIXct("1985/02/10"),as.POSIXct("1987/06/10"),"7",
          "B",as.POSIXct("1989/02/11"),as.POSIXct("1990/02/12"),"8",
          "A",as.POSIXct("1997/05/15"),as.POSIXct("1999/06/15"),"9",
          "C",as.POSIXct("2010/08/29"),as.POSIXct("2015/08/29"),"10",
          "B",as.POSIXct("2010/04/11"),as.POSIXct("2012/04/12"),"11",
          "E",as.POSIXct("2004/02/10"),as.POSIXct("2005/02/15"),"12") %>%
  mutate(event = event %>% fct_reorder2(enddate, loc),
         loc = loc %>% fct_inseq() %>% fct_rev())
max_date = max(df$startdate,df$enddate)

df_for_labels <- df %>% group_by(event) %>% 
  summarize(date = max_date, # max(enddate) would give you the label at each event's max(enddate)
            loc = n()) # 1 for bottom-right. n() for top-right (counts number of rows in each event)
#> `summarise()` ungrouping output (override with `.groups` argument)

ggplot(df)+
  geom_segment(aes(y=loc, yend = loc, x = startdate, xend = enddate, colour=event),size = 5,alpha=0.6) +
  geom_label(data = df_for_labels, aes(label=event, y = loc, x=date), size=2) +
  xlab("Year") + ylab("LoC") +
  scale_x_datetime(date_breaks = "year", date_labels = "%Y") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  facet_grid(rows = vars(event), scales = "free")

Created on 2020-10-21 by the reprex package (v0.3.0)

Arthur Yip

Because df$loc is a character vector, ggplot treats it as a discrete variable when it is mapped to the y-axis and orders its values alphabetically, which is equivalent to:

df$loc = as.factor(df$loc)

This puts "10" before "2". You want numeric order, so convert the values to numbers before building the factor:

df$loc = factor(as.numeric(as.character(df$loc)), levels = 12:1)  # as.character() guards against loc already being a factor

The levels = 12:1 argument sets the level order explicitly. ggplot draws the first level at the bottom of the y-axis, so listing the levels from 12 down to 1 puts the lower values above the larger ones.

This fixes the ordering. I don't know how to solve the label problem.
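The ordering behaviour described in this answer can be reproduced in base R on a small, made-up vector (the values and variable names here are illustrative, not the question's data). The sketch also shows one pitfall worth knowing: calling as.numeric() on something that is already a factor returns the internal level codes, not the original numbers.

```r
# Hypothetical loc values stored as text
loc <- c("1", "2", "10", "12")

# Character -> factor: the levels are sorted as text
alphabetical <- factor(loc)
levels(alphabetical)                     # "1", "10", "12", "2"

# Converting to numeric first and listing the levels explicitly gives
# numeric order (descending here); droplevels() removes unused levels
numeric_desc <- droplevels(factor(as.numeric(loc), levels = 12:1))
levels(numeric_desc)                     # "12", "10", "2", "1"

# Pitfall: as.numeric() on a factor returns the level codes
codes <- as.numeric(alphabetical)        # 1, 4, 2, 3

# Going through as.character() recovers the actual values
values <- as.numeric(as.character(alphabetical))  # 1, 2, 10, 12
```

If loc might already be a factor by the time the conversion runs, going through as.character() first is the safer route.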