
I have a boxplot, and I am trying to plot two points of different colors on top of it, so that the user can see where they fall on the boxplot and compare them to each other. The problem is that I want the legend to show two dots with the corresponding colors and their IDs. Instead, the legend shows the boxplot categories and colors. How can I override the legend to show only what I want? Here's my code:

library(datasets)
library(ggplot2)

data(airquality)
airquality$Month <- factor(airquality$Month,
                           labels = c("May", "Jun", "Jul", "Aug", "Sep"))
airquality$ID <- seq_len(nrow(airquality))
dataPoint <- airquality[11, ]
dataPoint2 <- airquality[17, ]
plt <- ggplot(airquality, aes(x = Month, y = Ozone, color = Month)) +
  geom_boxplot(show.legend = TRUE, outlier.shape = NA) +
  geom_point(data = dataPoint, color = "darkblue", aes(x = Month, y = Ozone), size = 3, show.legend = TRUE) +
  geom_point(data = dataPoint2, color = "darkred", aes(x = Month, y = Ozone), size = 3, show.legend = TRUE) +
  theme(legend.position = "bottom")
plt


Claus Wilke
JustLearning
  • Not related to the question, but if your x-axis is continuous I would change the palette from discrete to something nicer – pogibas Feb 05 '18 at 21:13

2 Answers


I would accomplish this by mapping the points to a different aesthetic than the boxplots. The points then get their own scale, so their legend is shown separately from the boxplot legend, and setting show.legend = FALSE on geom_boxplot hides the month legend entirely. You could map the points to fill (as below), to point shape, or to any other aesthetic. Alternatively, you could map the fill of the boxplot geom and the color of the point geom.

For example:

library(datasets)
library(ggplot2)

data(airquality)
airquality$Month <- factor(airquality$Month,
                           labels = c("May", "Jun", "Jul", "Aug", "Sep"))
airquality$ID <- seq_len(nrow(airquality))
# Label the two rows to highlight; all other rows stay NA
points <- c(11, 17)
airquality$Points <- NA
airquality$Points[points] <- c("Point a", "Point b")
plt <- ggplot(airquality, aes(x = Month, y = Ozone, color = Month)) +
  geom_boxplot(outlier.shape = NA) +
  # shape = 21 is a filled circle, so the fill aesthetic sets the point color
  geom_point(data = airquality[!is.na(airquality$Points), ],
             mapping = aes(x = Month, y = Ozone, fill = Points),
             size = 3, shape = 21, inherit.aes = FALSE) +
  theme(legend.position = "bottom")
plt
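
If only the two points should appear in the legend, as the question asks, one possible variation is sketched below. It is not part of the original answer: the data frame name `highlighted` is hypothetical, and the fill colors are assumed from the question's 'darkblue' and 'darkred'.

```r
library(ggplot2)

# Same data preparation as in the example above
data(airquality)
airquality$Month <- factor(airquality$Month,
                           labels = c("May", "Jun", "Jul", "Aug", "Sep"))
airquality$Points <- NA
airquality$Points[c(11, 17)] <- c("Point a", "Point b")
# Hypothetical helper name: the two rows to highlight
highlighted <- airquality[!is.na(airquality$Points), ]

plt <- ggplot(airquality, aes(x = Month, y = Ozone, color = Month)) +
  # show.legend = FALSE keeps the boxplot layer out of the legend
  geom_boxplot(outlier.shape = NA, show.legend = FALSE) +
  geom_point(data = highlighted,
             mapping = aes(x = Month, y = Ozone, fill = Points),
             size = 3, shape = 21, inherit.aes = FALSE) +
  # Assumed colors, taken from the question
  scale_fill_manual(values = c("Point a" = "darkblue", "Point b" = "darkred")) +
  theme(legend.position = "bottom")
plt
```

The boxplot outlines keep their per-month colors, but only the fill legend for the two points is drawn.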


Claus Wilke
alan ocallaghan

The easiest way to make a plot like this would be to:

  • Combine the dataPoint datasets (using rbind), so that you only need to call geom_point once
  • Use fill instead of color for the boxplot
  • Define the point colors using scale_color_manual

Code:

# Combine datasets
dataPoints <- rbind(dataPoint, dataPoint2)

# Plot
ggplot(airquality, aes(Month, Ozone, fill = Month)) +
    geom_boxplot(outlier.shape = NA) +
    geom_point(data = dataPoints, 
               aes(Month, Ozone, color = factor(ID)), 
               size = 3) +
    labs(color = "ID",
         fill = "Month") +
    scale_color_manual(values = c("darkblue", "darkred")) +
    theme(legend.position = "bottom")



PS: I wouldn't add a palette for month (fill), as this information is already shown on the x-axis (redundant information). To remove the fill legend you can add guides(fill = "none"); the older form guides(fill = FALSE) is deprecated in recent ggplot2 versions.
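
As a minimal, self-contained sketch of the PS above (it rebuilds `dataPoints` directly from rows 11 and 17 instead of using `rbind`, which is an assumption made only to keep the example runnable on its own), dropping the fill legend could look like this:

```r
library(ggplot2)

data(airquality)
airquality$Month <- factor(airquality$Month,
                           labels = c("May", "Jun", "Jul", "Aug", "Sep"))
airquality$ID <- seq_len(nrow(airquality))
# Equivalent to rbind(dataPoint, dataPoint2) from the question
dataPoints <- airquality[c(11, 17), ]

plt <- ggplot(airquality, aes(Month, Ozone, fill = Month)) +
  geom_boxplot(outlier.shape = NA) +
  geom_point(data = dataPoints,
             aes(Month, Ozone, color = factor(ID)),
             size = 3) +
  labs(color = "ID") +
  scale_color_manual(values = c("darkblue", "darkred")) +
  # Suppress the month (fill) legend; only the ID legend remains
  guides(fill = "none") +
  theme(legend.position = "bottom")
plt
```

The boxes are still filled by month, but the legend now lists only the two IDs.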


Edit after OP's comment to use shape:

In case you want shapes instead of colors:

ggplot(airquality, aes(Month, Ozone, fill = Month)) +
    geom_boxplot(outlier.shape = NA) +
    geom_point(data = dataPoints, 
               aes(Month, Ozone, shape = factor(ID)), 
               size = 3) +
    labs(shape = "ID",
         fill = "Month") +
    scale_shape_manual(values = c(15, 17)) +
    theme(legend.position = "bottom")
pogibas
  • Thank you! This worked for the test case I did, but here's a problem. The original graph that I posted (above) had boxplots with lines of different colors and no fill (white fill in the middle). That's how I need them displayed. Now, if both geom_point and geom_boxplot use "color" for different colors, then I can't remove the second part of the legend with "guides(fill = FALSE)". Any tips on how to work around that? – JustLearning Feb 08 '18 at 16:51
  • @AnyaR in `geom_point` do this: `geom_point(data = dataPoints, aes(Month, Ozone, fill = factor(ID)), shape = 21)`. For the main `ggplot` call, change `fill = Month` to `color = Month`. Here we change fill to color. Let me know how it goes. – pogibas Feb 08 '18 at 17:11
  • Thank you so much! This worked! I almost figured this out on my own, but without shape = 21 the points were coming out black, disregarding my fill instructions. This little detail really helped, thank you! :) – JustLearning Feb 08 '18 at 17:39
  • Any chance you could recommend how I could give the two points different shapes (while keeping the same color)? I tried shape = c(15, 17), for square and triangle, but this doesn't seem to work. :( Again, the challenge for me is to show these two different shapes in the legend, next to ID. Thank you in advance!!! – JustLearning Apr 06 '18 at 12:35
  • @AnyaR I edited my answer, please let me know if there's something else I can help you with :-) – pogibas Apr 06 '18 at 12:41
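
Putting the comment thread together, a possible sketch for unfilled boxplots with colored outlines plus two same-colored points of different shapes is shown below. The point color "black" and the use of show.legend = FALSE to hide the month legend are assumptions for illustration, not taken from the thread.

```r
library(ggplot2)

data(airquality)
airquality$Month <- factor(airquality$Month,
                           labels = c("May", "Jun", "Jul", "Aug", "Sep"))
airquality$ID <- seq_len(nrow(airquality))
dataPoints <- airquality[c(11, 17), ]

plt <- ggplot(airquality, aes(Month, Ozone, color = Month)) +
  # Outline colored by month, default white fill, no legend entry
  geom_boxplot(outlier.shape = NA, show.legend = FALSE) +
  # Shape is mapped to ID; color is a fixed (assumed) value, not a mapping
  geom_point(data = dataPoints,
             aes(Month, Ozone, shape = factor(ID)),
             color = "black", size = 3, inherit.aes = FALSE) +
  scale_shape_manual(values = c(15, 17)) +
  labs(shape = "ID") +
  theme(legend.position = "bottom")
plt
```

Because the point layer maps only shape, the legend shows just the square and the triangle next to their IDs.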