
I want to overlay a line plot over a box plot with custom colors.

I feel almost certain that I've done this before, but I can't find a solution right now. Any takers on this question? Thanks in advance.

The box plot

This code:

  ggplot(plotdata) +
    geom_boxplot(data=resp_daily_ca_ranked,
                 aes(x=as.factor(week), y=pct, fill = rankgroup)) +
    scale_fill_manual(values=COLS) +
    # NOT RUN: geom_line(aes(week, pct, colour=Group)) +
    theme_minimal() + xlab("") + ylab(YLAB) +
    scale_color_manual(values=COLS) +
    scale_y_continuous(labels = scales::percent) +
    ggtitle(label = TITLE, subtitle = SUBTITLE)

produces the desired boxplot: boxplot_alone

The line plot

This code:

  ggplot(plotdata) +
    # NOT RUN: geom_boxplot(data=resp_daily_ca_ranked,
    # NOT RUN:              aes(x=as.factor(week), y=pct, fill = rankgroup)) +
    scale_fill_manual(values=COLS) +
    geom_line(aes(week, pct, colour=Group)) +
    theme_minimal() + xlab("") + ylab(YLAB) +
    scale_color_manual(values=COLS) +
    scale_y_continuous(labels = scales::percent) +
    ggtitle(label = TITLE, subtitle = SUBTITLE)

produces the desired line plot:

line_plot_alone

The error

However this code, with the desired components:

  ggplot(plotdata) +
    geom_boxplot(data=resp_daily_ca_ranked,
                 aes(x=as.factor(week), y=pct, fill = rankgroup)) +
    scale_fill_manual(values=COLS) +
    geom_line(aes(week, pct, colour=Group)) +
    theme_minimal() + xlab("") + ylab(YLAB) +
    scale_color_manual(values=COLS) +
    scale_y_continuous(labels = scales::percent) +
    ggtitle(label = TITLE, subtitle = SUBTITLE)

produces this error:

Error: `mapped_discrete` objects can only be created from numeric vectors
Run `rlang::last_error()` to see where the error occurred.
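
A likely reading of this error (an interpretation, not something confirmed from the ggplot2 source): the boxplot layer maps x to a factor, which sets up a discrete x scale, while the line layer maps x to the raw Date column, and the two layers cannot share that scale. One way to check this is to give both layers the same discrete x and add a group aesthetic so geom_line knows which points to connect. The sketch below uses made-up toy data; box_df, line_df and p_factor are hypothetical names, not objects from the question.

```r
library(ggplot2)

# Toy data (hypothetical): 3 weeks, 2 groups, 3 observations per box.
weeks <- as.Date("2020-04-03") + 7 * 0:2
box_df <- data.frame(
  week = rep(weeks, each = 6),
  grp  = rep(c("1", "2"), times = 9),
  pct  = (1:18) / 100
)
# One summary value per week and group for the line layer.
line_df <- data.frame(
  week = rep(weeks, each = 2),
  grp  = rep(c("1", "2"), times = 3),
  pct  = c(0.03, 0.04, 0.09, 0.10, 0.15, 0.16)
)

p_factor <- ggplot() +
  geom_boxplot(data = box_df,
               aes(x = as.factor(week), y = pct, fill = grp)) +
  # Same discrete x as the boxplot layer; group = grp tells geom_line
  # which points belong to the same line across factor levels.
  geom_line(data = line_df,
            aes(x = as.factor(week), y = pct, colour = grp, group = grp))

# Building the plot forces the scales to be trained, which is the step
# where the scale conflict would surface.
built <- ggplot_build(p_factor)
```

With a discrete x the week labels are no longer treated as dates, so date-specific axis formatting would not apply in this variant.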

The naive approach

This code:

ggplot(plotdata) +
  geom_boxplot(data=resp_daily_ca_ranked,
               aes(x=week, y=pct, fill = rankgroup)) + # <-- removed factor
  scale_fill_manual(values=COLS) +
  geom_line(aes(week, pct, colour=Group)) +
  theme_minimal() + xlab("") + ylab(YLAB) +
  scale_color_manual(values=COLS) +
  scale_y_continuous(labels = scales::percent) +
  ggtitle(label = TITLE, subtitle = SUBTITLE)

Produces this plot (which is not desired)

not_what_I_want_at_all

Code to create example:

require(data.table)
require(ggplot2)

plotdata <- structure(
  list(structure(
    c(18355L, 18355L, 18355L, 18355L, 18355L, 18355L, 18355L, 18362L, 18362L, 18362L, 18362L, 
      18362L, 18362L, 18362L, 18369L, 18369L, 18369L, 18369L, 18369L, 18369L, 18369L, 18376L, 
      18376L, 18376L, 18376L, 18376L, 18376L, 18376L, 18383L, 18383L, 18383L, 18383L, 18383L, 
      18383L, 18383L, 18390L, 18390L, 18390L, 18390L, 18390L, 18390L, 18390L, 18397L, 18397L, 
      18397L, 18397L, 18397L, 18397L, 18397L, 18404L, 18404L, 18404L, 18404L, 18404L, 18404L, 
      18404L, 18411L, 18411L, 18411L, 18411L, 18411L, 18411L, 18411L, 18418L, 18418L, 18418L, 
      18418L, 18418L, 18418L, 18418L, 18425L, 18425L, 18425L, 18425L, 18425L, 18425L, 18425L, 
      18432L, 18432L, 18432L, 18432L, 18432L, 18432L, 18432L, 18439L, 18439L, 18439L, 18439L, 
      18439L, 18439L, 18439L, 18446L, 18446L, 18446L, 18446L, 18446L, 18446L, 18446L, 18453L, 
      18453L, 18453L, 18453L, 18453L, 18453L, 18453L, 18460L, 18460L, 18460L, 18460L, 18460L, 
      18460L, 18460L, 18467L, 18467L, 18467L, 18467L, 18467L, 18467L, 18467L),
    class = c('IDate', 'Date')),
    c(0.1649, 0.1571, 0.1402, 0.1288, 0.1179, 0.1011, 0.0826, 0.0465, 0.0435, 0.0401, 0.0388, 
      0.0363, 0.0334, 0.0288, 0.0282, 0.0263, 0.0266, 0.0251, 0.0229, 0.0224, 0.0186, 0.0431, 
      0.0322, 0.0325, 0.0253, 0.025, 0.0214, 0.0155, 0.0453, 0.0301, 0.0322, 0.0233, 0.0238, 
      0.0189, 0.0149, 0.026, 0.0218, 0.0231, 0.0189, 0.0188, 0.019, 0.0163, 0.0126, 0.0121, 
      0.013, 0.0119, 0.0109, 0.015, 0.0138, 0.007, 0.0071, 0.0073, 0.0066, 0.0068, 0.0074, 
      0.0069, 0.0045, 0.0049, 0.0051, 0.0047, 0.0048, 0.0048, 0.0046, 0.0039, 0.0039, 0.0042, 
      0.0039, 0.0036, 0.0034, 0.0031, 0.0035, 0.0033, 0.0031, 0.0033, 0.0031, 0.0036, 0.0032, 
      0.0025, 0.003, 0.0028, 0.0031, 0.0031, 0.0029, 0.0027, 0.0023, 0.0025, 0.0023, 0.0026, 
      0.0024, 0.0028, 0.0025, 0.002, 0.0021, 0.0021, 0.002, 0.0024, 0.0027, 0.003, 0.0017, 
      0.0019, 0.0023, 0.0022, 0.0022, 0.0022, 0.0023, 0.0025, 0.0026, 0.0026, 0.0025, 
      0.0024, 0.0023, 0.0024, 0.0027, 0.0032, 0.0029, 0.0028, 0.0024, 0.0034, 0.0033),
    c('1', '2', '3', '4', '5', '6', '7', '1', '2', '3', '4', '5', '6', '7', '1', '2', 
      '3', '4', '5', '6', '7', '1', '2', '3', '4', '5', '6', '7', '1', '2', '3', '4', 
      '5', '6', '7', '1', '2', '3', '4', '5', '6', '7', '1', '2', '3', '4', '5', '6', 
      '7', '1', '2', '3', '4', '5', '6', '7', '1', '2', '3', '4', '5', '6', '7', '1', 
      '2', '3', '4', '5', '6', '7', '1', '2', '3', '4', '5', '6', '7', '1', '2', '3', 
      '4', '5', '6', '7', '1', '2', '3', '4', '5', '6', '7', '1', '2', '3', '4', '5', 
      '6', '7', '1', '2', '3', '4', '5', '6', '7', '1', '2', '3', '4', '5', '6', '7', 
      '1', '2', '3', '4', '5', '6', '7')),
  .Names = c('week', 'pct', 'Group'),
  row.names = c(NA, -119L),
  class = c('data.table', 'data.frame'))
resp_daily_ca_ranked <- structure(list(structure(
  c(18355L, 18355L, 18355L, 18362L, 18362L, 18362L, 18369L, 18369L, 18369L, 18376L, 18376L, 18376L, 
    18383L, 18383L, 18383L, 18390L, 18390L, 18390L, 18397L, 18397L, 18397L, 18404L, 18404L, 18404L, 
    18411L, 18411L, 18411L, 18418L, 18418L, 18418L, 18425L, 18425L, 18425L, 18432L, 18432L, 18432L, 
    18439L, 18439L, 18439L, 18446L, 18446L, 18446L, 18453L, 18453L, 18453L, 18460L, 18460L, 18460L, 
    18467L, 18467L, 18467L, 18355L, 18355L, 18355L, 18362L, 18362L, 18362L, 18369L, 18369L, 18369L, 
    18376L, 18376L, 18376L, 18383L, 18383L, 18383L, 18390L, 18390L, 18390L, 18397L, 18397L, 18397L, 
    18404L, 18404L, 18404L, 18411L, 18411L, 18411L, 18418L, 18418L, 18418L, 18425L, 18425L, 18425L, 
    18432L, 18432L, 18432L, 18439L, 18439L, 18439L, 18446L, 18446L, 18446L, 18453L, 18453L, 18453L, 
    18460L, 18460L, 18460L, 18467L, 18467L, 18467L, 18355L, 18355L, 18355L, 18362L, 18362L, 18362L, 
    18369L, 18369L, 18369L, 18376L, 18376L, 18376L, 18383L, 18383L, 18383L, 18390L, 18390L, 18390L, 
    18397L, 18397L, 18397L, 18404L, 18404L, 18404L, 18411L, 18411L, 18411L, 18418L, 18418L, 18418L, 
    18425L, 18425L, 18425L, 18432L, 18432L, 18432L, 18439L, 18439L, 18439L, 18446L, 18446L, 18446L, 
    18453L, 18453L, 18453L, 18460L, 18460L, 18460L, 18467L, 18467L, 18467L, 18355L, 18355L, 18355L, 
    18362L, 18362L, 18362L, 18369L, 18369L, 18369L, 18376L, 18376L, 18376L, 18383L, 18383L, 18383L, 
    18390L, 18390L, 18390L, 18397L, 18397L, 18397L, 18404L, 18404L, 18404L, 18411L, 18411L, 18411L, 
    18418L, 18418L, 18418L, 18425L, 18425L, 18425L, 18432L, 18432L, 18432L, 18439L, 18439L, 18439L, 
    18446L, 18446L, 18446L, 18453L, 18453L, 18453L, 18460L, 18460L, 18460L, 18467L, 18467L, 18467L, 
    18355L, 18355L, 18355L, 18362L, 18362L, 18362L, 18369L, 18369L, 18369L, 18376L, 18376L, 18376L, 
    18383L, 18383L, 18383L, 18390L, 18390L, 18390L, 18397L, 18397L, 18397L, 18404L, 18404L, 18404L, 
    18411L, 18411L, 18411L, 18418L, 18418L, 18418L, 18425L, 18425L, 18425L, 18432L, 18432L, 18432L, 
    18439L, 18439L, 18439L, 18446L, 18446L, 18446L, 18453L, 18453L, 18453L, 18460L, 18460L, 18460L, 
    18467L, 18467L, 18467L, 18355L, 18355L, 18355L, 18362L, 18362L, 18362L, 18369L, 18369L, 18369L, 
    18376L, 18376L, 18376L, 18383L, 18383L, 18383L, 18390L, 18390L, 18390L, 18397L, 18397L, 18397L, 
    18404L, 18404L, 18404L, 18411L, 18411L, 18411L, 18418L, 18418L, 18418L, 18425L, 18425L, 18425L, 
    18432L, 18432L, 18432L, 18439L, 18439L, 18439L, 18446L, 18446L, 18446L, 18453L, 18453L, 18453L, 
    18460L, 18460L, 18460L, 18467L, 18467L, 18467L, 18355L, 18355L, 18355L, 18362L, 18362L, 18362L, 
    18369L, 18369L, 18369L, 18376L, 18376L, 18376L, 18383L, 18383L, 18383L, 18390L, 18390L, 18390L, 
    18397L, 18397L, 18397L, 18404L, 18404L, 18404L, 18411L, 18411L, 18411L, 18418L, 18418L, 18418L, 
    18425L, 18425L, 18425L, 18432L, 18432L, 18432L, 18439L, 18439L, 18439L, 18446L, 18446L, 18446L, 
    18453L, 18453L, 18453L, 18460L, 18460L, 18460L, 18467L, 18467L, 18467L),
  class = c('IDate', 'Date')),
  c('4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', 
    '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', 
    '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '5', '5', '5', '5', '5', '5', 
    '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', 
    '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', 
    '5', '5', '5', '5', '5', '5', '5', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', 
    '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', 
    '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', 
    '1', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', 
    '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', 
    '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '2', '2', '2', '2', '2', 
    '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', 
    '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', 
    '2', '2', '2', '2', '2', '2', '2', '2', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', 
    '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', 
    '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', 
    '3', '3', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', 
    '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', 
    '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7'),
  c(0.1277, 0.1051, 0.1096, 0.0341, 0.0329, 0.0382, 0.031, 0.0184, 0.0176, 0.0392, 0.022, 0.0124, 
    0.0306, 0.0186, 0.0077, 0.0188, 0.0228, 0.0197, 0.0087, 0.0157, 0.0181, 0.0063, 0.0055, 0.0082,
    0.0037, 0.0051, 0.0055, 0.004, 0.0025, 0.0043, 0.0023, 0.0024, 0.0046, 0.0022, 0.0029, 0.0038,
    0.0027, 0.0024, 0.0028, 0.0024, 0.0028, 0.002, 0.0025, 0.002, 0.0019, 0.0023, 0.0015, 0.0016, 
    0.0027, 0.0036, 0.0028, 0.1271, 0.1181, 0.1218, 0.04, 0.0398, 0.0353, 0.0384, 0.0244, 0.0168, 
    0.0249, 0.0251, 0.0333, 0.0185, 0.034, 0.0263, 0.0254, 0.0213, 0.0191, 0.0186, 0.0095, 0.0107,
    0.0064, 0.0066, 0.0048, 0.0051, 0.0047, 0.0055, 0.0044, 0.0029, 0.0035, 0.0025, 0.0024, 0.0025,
    0.0037, 0.0022, 0.0022, 0.0041, 0.0031, 0.0016, 0.0026, 0.0025, 0.0029, 0.004, 0.002, 0.0016, 
    0.0037, 0.0037, 0.0015, 0.0031, 0.0024, 0.0029, 0.1671, 0.1859, 0.1612, 0.0471, 0.0549, 0.043,
    0.0367, 0.0315, 0.0299, 0.0444, 0.0269, 0.047, 0.0355, 0.0448, 0.0518, 0.0237, 0.0251, 0.0275, 
    0.0144, 0.0126, 0.0128, 0.0073, 0.0084, 0.0071, 0.005, 0.0065, 0.004, 0.005, 0.0046, 0.0036, 
    0.005, 0.0057, 0.003, 0.003, 0.0022, 0.002, 0.0022, 0.0028, 0.002, 0.0025, 0.0026, 0.0014,
    0.0021, 0.0029, 0.0011, 0.0031, 0.0037, 0.0016, 0.0038, 0.0035, 0.0023, 0.1047, 0.1079, 0.09, 
    0.0332, 0.0354, 0.0328, 0.0195, 0.0345, 0.0282, 0.016, 0.0333, 0.0343, 0.012, 0.0362, 0.0275, 
    0.0189, 0.0205, 0.0195, 0.016, 0.0124, 0.0116, 0.0073, 0.0064, 0.0063, 0.0044, 0.0052, 0.0046, 
    0.0037, 0.0037, 0.0032, 0.0039, 0.0039, 0.0032, 0.0021, 0.0033, 0.0033, 0.0029, 0.0033, 0.0032, 
    0.0026, 0.0023, 0.0028, 0.002, 0.0023, 0.0023, 0.0021, 0.0031, 0.0028, 0.0031, 0.0039, 0.0034, 
    0.1311, 0.1591, 0.1527, 0.0482, 0.0451, 0.0413, 0.0219, 0.0258, 0.0286, 0.014, 0.0223, 0.0631, 
    0.0132, 0.0178, 0.0395, 0.0198, 0.0209, 0.0202, 0.0186, 0.0197, 0.0099, 0.0093, 0.0103, 0.0052, 
    0.0061, 0.0065, 0.0038, 0.005, 0.0048, 0.0026, 0.005, 0.0059, 0.0028, 0.0054, 0.0031, 0.0026, 
    0.003, 0.0034, 0.0015, 0.0034, 0.0024, 0.0028, 0.0027, 0.0026, 0.0016, 0.0036, 0.0021, 0.0026, 
    0.0031, 0.0028, 0.0029, 0.1301, 0.1439, 0.1352, 0.0382, 0.0416, 0.0413, 0.0406, 0.0197, 0.028,
    0.0303, 0.0241, 0.0339, 0.0219, 0.0281, 0.0365, 0.024, 0.0207, 0.0223, 0.02, 0.024, 0.0121, 
    0.0092, 0.0094, 0.0064, 0.0072, 0.0051, 0.0054, 0.0034, 0.0026, 0.0035, 0.0023, 0.002, 0.0024, 
    0.0032, 0.0024, 0.0032, 0.0028, 0, 0.0013, 0.0038, 0.001, 0.0025, 0.0016, 0.002, 0.0024, 0.0024,
    0.001, 0.0022, 0.003, 0.004, 0.0037, 0.0668, 0.0612, 0.0969, 0.0247, 0.032, 0.034, 0.0135, 
    0.0109, 0.0208, 0.0059, 0.007, 0.0173, 0.0062, 0.0046, 0.0185, 0.0119, 0.0196, 0.0182, 0.0125, 
    0.021, 0.0151, 0.0062, 0.0066, 0.0072, 0.0043, 0.007, 0.0053, 0.0018, 0.005, 0.0028, 0.003, 
    0.0031, 0.0032, 0.0023, 0.0014, 0.0032, 0.002, 0.001, 0.0027, 0.0031, 0.0014, 0.0028, 0.0019,
    0.0013, 0.0021, 0.0012, 0, 0.0033, 0.0024, 0.003, 0.0034)),
  row.names = c(NA, -357L),
  class = c('data.table', 'data.frame'),
  .Names = c('week', 'rankgroup', 'pct'))


TITLE <- "Incremental Response Rate by Community Area (grouped)"
SUBTITLE <- sprintf("Weeks ending %s to %s",
                    format(min(plotdata$week), "%m/%d/%y"), 
                    format(max(plotdata$week), "%m/%d/%y"))
YLAB <- "Incremental Percent Responding"
COLS <- structure(c('#F3771AFF', '#D84C3EFF', '#AE305CFF', '#7F1E6CFF', '#500E6CFF', 
                    '#1E0C44FF', '#000004FF'),
                  .Names = c('1', '2', '3', '4', '5', '6', '7'))
geneorama

2 Answers

The code below is practically the same as the question's code. Perhaps the main difference is that the data argument is assigned explicitly in both the boxplot and the line layers.

ggplot(plotdata) +
  geom_boxplot(data = resp_daily_ca_ranked,
               mapping = aes(week, pct, fill = rankgroup)) +
  geom_line(data = plotdata, 
            mapping = aes(week, pct, color = Group), show.legend = FALSE) +
  scale_fill_manual(values = COLS) +
  scale_color_manual(values = COLS) +
  scale_y_continuous(labels = scales::percent) +
  xlab("") + ylab(YLAB) +
  ggtitle(label = TITLE, subtitle = SUBTITLE) +
  theme_minimal()


Rui Barradas
  • This is what I did in the "naive" approach. I'm looking for the box plots broken up by group within the time period. – geneorama Jul 31 '20 at 04:37

I found the answer today.

First, I needed to remember that aes accepts a "group" aesthetic (and I literally mean remember, because only x and y are documented as named arguments of aes).

Second, it's necessary to use interaction() inside the boxplot's aes to capture both the time and the group.

Third, I also combined the grouped and ungrouped data, then subset the data within the geom calls (source).

combodata <- rbind(plotdata[,list(week, pct, Group, subset="line")],
                   resp_daily_ca_ranked[ , list(week, pct, Group=rankgroup,
                                                subset="boxplot")])
ggplot(combodata, aes(x=week, y=pct, fill=Group)) +
  geom_boxplot(aes(group=interaction(week, Group)),
               data = function(x){x[subset=="boxplot"]})+
  geom_line(aes(color=Group), data = function(x){x[subset=="line"]}) +
  theme_minimal() + xlab("") + ylab(YLAB) +
  scale_color_manual(values=COLS) +
  scale_fill_manual(values=COLS) +
  scale_y_continuous(labels = scales::percent) +
  ggtitle(label = TITLE, subtitle = SUBTITLE)

I'm not sure if it was necessary to combine the data, but it's nice anyway because of the way I'm going to handle the data.
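
On the question of whether combining was needed: since each ggplot2 layer can take its own data argument, a variant that keeps the two tables separate should also work, as long as the boxplot layer still gets group = interaction(week, rankgroup). This is a minimal sketch on made-up toy data; box_df, line_df and p_separate are hypothetical names standing in for resp_daily_ca_ranked, plotdata and the final plot.

```r
library(ggplot2)

# Toy stand-ins (hypothetical) for the two tables in the question.
weeks <- as.Date("2020-04-03") + 7 * 0:2
box_df <- data.frame(
  week      = rep(weeks, each = 6),
  rankgroup = rep(c("1", "2"), times = 9),
  pct       = (1:18) / 100
)
line_df <- data.frame(
  week  = rep(weeks, each = 2),
  Group = rep(c("1", "2"), times = 3),
  pct   = c(0.03, 0.04, 0.09, 0.10, 0.15, 0.16)
)

p_separate <- ggplot() +
  # x stays a Date (continuous); the interaction gives one box per
  # week-and-group combination instead of one box per group.
  geom_boxplot(data = box_df,
               aes(x = week, y = pct, fill = rankgroup,
                   group = interaction(week, rankgroup))) +
  # colour implies the grouping for the lines, so no group is needed here.
  geom_line(data = line_df,
            aes(x = week, y = pct, colour = Group))

built <- ggplot_build(p_separate)
```

One difference from the combined version: because fill and colour come from differently named columns here, the plot would show two legends unless the columns are renamed to match or one legend is switched off.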


And, by the way, the reason for all this is to find out if there are meaningful deviations within the aggregated data.

Looking at this one time period, for example, is interesting: why did one of the lowest performers have a spike? Is it related to anything else? This gives me a sense of where to look in the data.
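
To go from spotting a spike by eye to listing the rows behind it, one option is to flag the values that fall outside the box plot whiskers within each group. The sketch below uses base R's boxplot.stats(), which applies the usual 1.5 x IQR rule; the data frame one_week, the helper is_outlier and the numbers are all made up for illustration and are not taken from the data above.

```r
# Toy data (hypothetical): one week, two rank groups, five values each.
one_week <- data.frame(
  rankgroup = rep(c("6", "7"), each = 5),
  pct = c(0.010, 0.011, 0.012, 0.013, 0.012,
          0.002, 0.003, 0.002, 0.003, 0.020)
)

# TRUE for values that boxplot.stats() reports as beyond the whiskers.
is_outlier <- function(x) x %in% boxplot.stats(x)$out

# ave() applies the helper within each rank group; the logical result is
# stored as 0/1 in a numeric vector, hence the comparison with 1.
one_week$outlier <- ave(one_week$pct, one_week$rankgroup,
                        FUN = is_outlier) == 1

# Rows worth a closer look.
spikes <- one_week[one_week$outlier, ]
```

On real data the same idea would be applied after subsetting to the week of interest, keeping any identifying columns so the flagged rows can be traced back to their source.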

geneorama