1

The code plots data with computed weekly regression lines. I would like to combine the legend with weekly doubling times, computed from the weekly slopes.

Nice to solve question: I could get the weekly regression lines with a geom_smooth. However, I could not extract the slope coefficient (to compute the doubling time) from the geom_smooth. I therefore had to do equivalent regressions outside the ggplot portion.
Any suggestions to do this more elegantly?

Main question: How can I combine the legend with the column of computed doubling times? With a lot of fiddling I can place the legend sort of next to these computed doubling times. It does not look nice and when I include another datapoint I have to start fiddling all over again. Suggestions will be appreciated. Thank you.

library(ggplot2)
library(gridExtra)

# Input data: Daily number of cases starting at day0 
cases <- c(1,1,2,3,7,10,13,16,24,38,51,62,85,116,150,202,240,274,402,554,709, 927) 
day0 <- as.Date("2020-03-04")

# actual dates by counting from day0
dates <- day0 + 1:length(cases) 

# week number as factor to obtain regression line for each week
week <- as.factor(1 + (1:length(cases) ) %/% 7)

# tibble with daily data, also with week number 
datatib <- tibble( dates, cases, week)

# tibble with computed doubling time per week
resulttib <- tibble(Week=unique(week), Doubling_Time=NA)

# linear regression on log of dependent variable
for (wk in unique(week) ) {
  resulttib[wk,'Doubling_Time'] <- 
    round( log(2) / lm(log(cases) ~ dates, data=datatib[week==wk,] )$coef['dates'], 2 )
}

# insert row at top for second line of column heading
resulttib <- add_row(resulttib, Week = '', Doubling_Time = '(days)', .before = 1) 

doublingtime = tableGrob(resulttib[,'Doubling_Time'], rows=NULL)

gp <- 
  ggplot(datatib, aes(dates, cases, color = week ) ) +
  geom_point() +
  geom_smooth( method = "lm", se = FALSE) +
  scale_x_date() +
  scale_y_continuous(trans="log10") +
  labs(x = "", y = "Number of Cases") +
  ggtitle("Number of Cases with Weekly Doubling Times") +
  theme(plot.title = element_text(hjust = 0.5)) +

  theme(legend.position=c(0.75,0), 
        legend.justification=c(1.2, -0.1), legend.text=element_text(size=14) ) +
  annotation_custom( doublingtime, 
      xmin=dates[length(cases)]-2, xmax=dates[length(cases)], ymin=-2.65 )

e

donman
  • 57
  • 3

1 Answers1

1

As an answer to your main question ... try this. I simply joined the doubling time to your main df and created a new var combining no. of week and doubling time. Color is then mapped on this new var.

Concerning your second question: There are ways to compute the slope from the computed values of geom_smooth/stat_smooth. However, in my opinion your approach of computing the slopes is the easier way to the kind of problem your are trying to solve.

library(ggplot2)
library(dplyr)
library(gridExtra)

# Input data: Daily number of cases starting at day0 
cases <- c(1,1,2,3,7,10,13,16,24,38,51,62,85,116,150,202,240,274,402,554,709, 927) 
day0 <- as.Date("2020-03-04")

# actual dates by counting from day0
dates <- day0 + 1:length(cases) 

# week number as factor to obtain regression line for each week
week <- as.factor(1 + (1:length(cases) ) %/% 7)

# tibble with daily data, also with week number 
datatib <- tibble( dates, cases, week)

# tibble with computed doubling time per week
resulttib <- tibble(Week=unique(week), Doubling_Time=NA)

# linear regression on log of dependent variable
for (wk in unique(week) ) {
  resulttib[wk,'Doubling_Time'] <- 
    round( log(2) / lm(log(cases) ~ dates, data=datatib[week==wk,] )$coef['dates'], 2 )
}

# insert row at top for second line of column heading
#resulttib <- add_row(resulttib, Week = '', Doubling_Time = '(days)', .before = 1) 

#doublingtime = tableGrob(resulttib[,'Doubling_Time'], rows=NULL)

datatib1 <- datatib %>% 
  left_join(resulttib, by = c("week" = "Week")) %>% 
  mutate(week1 = paste0(week, " (", Doubling_Time, ")"))

gp <- 
  ggplot(datatib1, aes(dates, cases, color = week1 ) ) +
  geom_point() +
  geom_smooth( method = "lm", se = FALSE) +
  scale_x_date() +
  scale_y_continuous(trans="log10") +
  labs(x = "", y = "Number of Cases") +
  ggtitle("Number of Cases with Weekly Doubling Times") +
  theme(plot.title = element_text(hjust = 0.5)) +
  theme(
    legend.position = c(.95, .05),
    legend.justification = c("right", "bottom"),
    legend.box.just = "right",
    legend.margin = margin(6, 6, 6, 6)
  ) +
  labs(color = "Week (Doubling time in days)")

gp

Created on 2020-03-27 by the reprex package (v0.3.0)

stefan
  • 90,330
  • 6
  • 25
  • 51
  • Thank you Stefan (vielen dank) for this nice solution. – donman Mar 28 '20 at 07:33
  • My pleasure (Gerne (: ). If you want do me a favor: Mark the question as answered. Besides giving me some credit it shows others with a similar problem that the solution worked and removes the question from the queue of questions still waiting for an answer. – stefan Mar 28 '20 at 07:58