
In my plot below, I have two separate sources of data (dat and dat2) used in two different geom_smooth() calls producing the black and the red regression lines (see pic below).

Is it possible to manually add another legend that shows the black line is called "Between" and red line is called "Within"?

library(tidyverse)

dat <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/cw2.csv')
dat$groups <- factor(dat$groups)

dat2 <- dat %>% group_by(groups) %>% summarize(mean_x = mean(x),
                                               mean_y = mean(y),
                                               .groups = 'drop')
dat %>% 
  ggplot() +
  aes(x, y, color = groups, shape = groups) +
  geom_point(size = 2) + 
  theme_classic() + 
  stat_ellipse(level = .6) +
  geom_point(data = dat2, 
             mapping = aes(x = mean_x, y = mean_y, fill = factor(groups)),
             size = 4, show.legend = FALSE, shape = 21) +
  geom_smooth(data = dat2, 
              mapping = aes(x = mean_x, y = mean_y, group = 1), 
              method = "lm", se = FALSE, color = 1, formula = 'y ~ x') + 
  geom_smooth(aes(group = 1), 
              method = "lm", se = FALSE, color = 2, formula = 'y ~ x') +
  scale_fill_manual(values = rep('black', 3))

enter image description here

rnorouzian

1 Answer


It looks like you need a second color scale to do this. You can use the ggnewscale package:

library(ggnewscale)

dat %>% 
  ggplot() +
  aes(x, y, color = groups, shape = groups) +
  geom_point(size = 2) + 
  theme_classic() + 
  stat_ellipse(level = .6) +
  geom_point(data = dat2, 
             mapping = aes(x = mean_x, y = mean_y),
             size = 4, show.legend = FALSE, shape = 21, fill = "black") +
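  # Fix the first color scale (used for groups) before starting a new one
  scale_color_discrete() +
  # Layers added after this call get their own, separate color scale
  new_scale_color() +
  geom_smooth(data = dat2, 
              mapping = aes(x = mean_x, y = mean_y, group = 1, color = "black"), 
              method = "lm", se = FALSE, formula = 'y ~ x') + 
  geom_smooth(aes(group = 1, color = "red"), 
              method = "lm", se = FALSE, formula = 'y ~ x') +
  # Use the mapped strings as literal colors and relabel them in the legend
  scale_color_identity(name = "", labels = c("Between", "Within"),
                       guide = guide_legend())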

enter image description here

Allan Cameron
  • Thanks Allan. But is it possible in `ggplot2` to create a completely manual legend, as with base R's `legend()`? I will be OK with something close to what I described. – rnorouzian Nov 01 '20 at 06:57
  • @rnorouzian see my update. The way ggplot2 works is that it insists anything you display in a legend has to be mapped to some aesthetic in the plot. It's actually a very sensible way to do things, and less error-prone than manually creating a legend. It's also much more flexible than you'd imagine. There is almost never a need to "manually" build a legend. You can see here that I have effectively done that anyway, but using ggplot's system: I have specified the names, colors and titles, and produced the desired result. – Allan Cameron Nov 01 '20 at 08:45
  • Accepted, but I'm not sure I completely agree with your last comment; the analyst must also be given the freedom that s/he needs. Also, it seems kind of ridiculous that simply adding a legend would require a completely new R package, something I really wanted to avoid. – rnorouzian Nov 01 '20 at 19:20
  • @rnorouzian my point was that you can build a legend quite easily and transparently this way. You don't _need_ a different package. There are other ways to do it, but they are longer and more cumbersome. Still, the inability to add extra color scales is definitely a shortcoming in ggplot. That's what the new package helps with. – Allan Cameron Nov 01 '20 at 19:25
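
For readers who want to avoid an extra package, as raised in the comments, one possible ggplot2-only approach is to map the two regression lines to an otherwise unused aesthetic (here `linetype`) and then recolor the legend keys with `override.aes`. This is a minimal sketch, not the method from the answer. The data below are simulated stand-ins (an assumption, since the original `cw2.csv` is not reproduced here); the column names `groups`, `x`, `y`, `mean_x` and `mean_y` follow the question.

```r
library(ggplot2)

# Simulated stand-in for the question's data (hypothetical values, not cw2.csv)
set.seed(1)
dat <- data.frame(
  groups = factor(rep(1:3, each = 20)),
  x = rnorm(60, mean = rep(c(2, 4, 6), each = 20))
)
dat$y <- 2 * rep(c(2, 4, 6), each = 20) - 0.5 * dat$x + rnorm(60, sd = 0.5)

# Group means, named as in the question
dat2 <- aggregate(cbind(x, y) ~ groups, data = dat, FUN = mean)
names(dat2) <- c("groups", "mean_x", "mean_y")

p <- ggplot(dat, aes(x, y, color = groups, shape = groups)) +
  geom_point(size = 2) +
  theme_classic() +
  stat_ellipse(level = 0.6) +
  geom_point(data = dat2, aes(x = mean_x, y = mean_y),
             size = 4, shape = 21, fill = "black", show.legend = FALSE) +
  # Mapping a constant string to linetype creates a legend entry for each line
  geom_smooth(data = dat2,
              aes(x = mean_x, y = mean_y, group = 1, linetype = "Between"),
              method = "lm", se = FALSE, color = "black", formula = y ~ x) +
  geom_smooth(aes(group = 1, linetype = "Within"),
              method = "lm", se = FALSE, color = "red", formula = y ~ x) +
  # Both lines stay solid; override.aes colors the two legend keys
  scale_linetype_manual(
    name = NULL,
    values = c(Between = "solid", Within = "solid"),
    guide = guide_legend(override.aes = list(color = c("black", "red")))
  )

p
```

The line colors are set as fixed parameters rather than mapped, so the `groups` color scale is left untouched and no second color scale is needed. The trade-off is that the legend colors are specified twice (once per layer and once in `override.aes`) and must be kept in sync by hand.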