I am building a faceted plot where each panel has 3 different sets of points, each with a different shape. Everything works fine until I try to add some text to each panel using geom_text(). When I include geom_text() in the plot, I get an error message: "insufficient values in a manual scale: 4 are needed but only 3 are provided". I can get rid of the error by adding an extra shape and color, but I cannot control the re-mapping of the shape/color factors that occurs when I add geom_text().

Here is the script that I am running:

library(ggplot2)
library(RColorBrewer)
library(cowplot)

make_hist_df <- function (vals, breaks, expt, type) {
  hist_d <- hist(vals,breaks=breaks,plot=FALSE)
  hist_d_nz<-hist_d$counts > 0
  n_d_nz<-length(hist_d$counts[hist_d_nz])
  hist_df <- data.frame(expt=character(n_d_nz), counts=numeric(n_d_nz), mids=numeric(n_d_nz),type=character(n_d_nz))
  hist_df$counts <- hist_d$counts[hist_d_nz]
  hist_df$mids <- hist_d$mids[hist_d_nz]
  hist_df$expt = expt
  hist_df$type = type

  return(hist_df)
  }

## get some normal distributions
n1<-rnorm(n=10000, mean=5,sd=1)
n2<-rnorm(n=5000,mean=15,sd=1)
n3<-rnorm(n=2000,mean=25,sd=1)

breaks=seq(0,30,0.5)

tot_hist_df = rbind(
  make_hist_df(n1,breaks,expt='one',type='low'),
  make_hist_df(n2,breaks,expt='one',type='mid'),
  make_hist_df(n3,breaks,expt='one',type='high')
  )

tot_hist_df = rbind(tot_hist_df,
  make_hist_df(n1,breaks,expt='two',type='low'),
  make_hist_df(n2,breaks,expt='two',type='mid'),
  make_hist_df(n3,breaks,expt='two',type='high')
  )

tot_hist_df$expt<-factor(tot_hist_df$expt,levels=c('one','two'), ordered=TRUE)
tot_hist_df$type<-factor(tot_hist_df$type,levels=c('low','mid','high'), ordered=TRUE)

s.open_circ<-1
s.closed_circ<-16
s.triangle <- 2
s.plus<-4
s.dot <- 20
sb.shapes   = c(s.open_circ, s.triangle, s.closed_circ)
sb.shapes_l = c(s.open_circ, s.triangle, s.closed_circ, s.dot)

q_set <- c('N1','N2','N3')
n_q <- length(q_set)

sb.colors <-brewer.pal(max(3,n_q),'Dark2') # 'Dark2', 'Set2', 'Paired'
sb.colors_l <- c(sb.colors,'black')

sb.sizes = rep(1.25,n_q)

## plot out without labels
p1 <- ggplot(data=tot_hist_df,aes(x=mids, y=counts, shape=type, color=type))+geom_point() +
  scale_color_manual(values=sb.colors) +
  scale_shape_manual(values=sb.shapes) +
  facet_wrap(~expt, ncol=2)

## make a label dataframe
hist_label=data.frame(expt=c('one','two'), lab=c('mean 5, 20','mean 5, 20 - dup'),type=c('xlab','xlab'))
hist_label$expt <- factor(hist_label$expt,levels=c('one','two'),ordered=TRUE)
hist_label$type <- factor(hist_label$type,levels=c('low','mid','high','xlab'),ordered=TRUE)

## plot out with labels
p2 <- ggplot(data=tot_hist_df,aes(x=mids, y=counts, shape=type, color=type))+geom_point() +
  geom_text(data=hist_label, aes(x=10,y=1000,label=lab)) +
  scale_color_manual(values=sb.colors_l) +
  scale_shape_manual(values=sb.shapes_l) +
  facet_wrap(~expt, ncol=2)

plot_grid(p1,p2,ncol=1)

And here is the output that it produces: [image: bad plot]

Both the colors and shapes differ between the top and bottom panel.

I do not understand why geom_text() is remapping the factor levels specified by "type", after I have explicitly specified them for both the plotted data and for the label data structure. This remapping (to something that looks alphabetical) throws off both the colors and the shapes.

  • It will be easier for us to help you if you provide some extra information. Specifically, it would be helpful if you could post (1) the plots, so people can judge whether they understand the problem, (2) minimal plotting code that produces the plot with the problems, not just one layer, and (3) a snippet of data that, in combination with (2), will reproduce the problems illustrated in (1). Data snippets are most easily shared by copying the output of `dput(your_data_snippet)`. – teunbrand May 30 '20 at 10:56
  • I have updated the question with a functional example and plot of the output. – Bill Pearson May 31 '20 at 21:38

1 Answer

Thank you for the reproducible example. It is now much clearer what is happening.

The remapping you describe happens because, while the scale learns the discrete values (training the scale), it does not cleverly combine the levels it learned from the histogram data with the levels it learns from the text data. To fix this part, you can set the limits of the scale manually.

ggplot(data=tot_hist_df,
       aes(x=mids, y=counts, shape=type, color=type))+
  geom_point() +
  geom_text(data=hist_label, 
            aes(x=10,y=1000,label=lab)) +
  scale_color_manual(values=sb.colors_l, limits = levels(hist_label$type)) +
  scale_shape_manual(values=sb.shapes_l, limits = levels(hist_label$type)) +
  facet_wrap(~expt, ncol=2)
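
As a rough check of the limits approach, here is a sketch on toy data (not the question's histograms). It uses `layer_data()` to see which colour each point actually received. The data frames `pts` and `lab_df` and the colour vector `cols` are made up for illustration:

```r
library(ggplot2)

# Toy point data (hypothetical): one point per type
pts <- data.frame(
  x = 1:3, y = 1:3,
  type = factor(c("low", "mid", "high"), levels = c("low", "mid", "high"))
)

# Toy label data (hypothetical): its 'type' carries the extra level "xlab"
lab_df <- data.frame(
  x = 2, y = 2, lab = "note",
  type = factor("xlab", levels = c("low", "mid", "high", "xlab"))
)

cols <- c("red", "green", "blue", "black")

p <- ggplot(pts, aes(x, y, colour = type)) +
  geom_point() +
  geom_text(data = lab_df, aes(label = lab)) +
  # Fix the order of the scale explicitly instead of letting it be learned
  scale_colour_manual(values = cols, limits = levels(lab_df$type))

# layer_data() returns the computed aesthetics for one layer (1 = the points)
point_data <- layer_data(p, 1)
print(point_data[, c("x", "colour")])
```

With the limits fixed, the unnamed colours should be assigned in the order of the limits (low, mid, high, xlab), independent of the order in which the layers train the scale.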

EDIT:

Alternatively, you can set drop = FALSE in the manual scales:

ggplot(data=tot_hist_df,
       aes(x=mids, y=counts, shape=type, color=type))+
  geom_point() +
  geom_text(data=hist_label, 
            aes(x=10,y=1000,label=lab)) +
  scale_color_manual(values=sb.colors_l, drop = FALSE) +
  scale_shape_manual(values=sb.shapes_l, drop = FALSE) +
  facet_wrap(~expt, ncol=2)

END EDIT

Here are a few tips that you might find useful. The first is that you can use named vectors in manual discrete scales, which keeps the link between the symbol and the colour/shape regardless of whether the symbol is in the data.

This is also useful when you have to make multiple similar plots and you don't know whether only a subset of categories is present in any single plot.

Notice that in the plot produced by the code below, the colours are correctly matched to the data, and no legend item is shown for the unused not_in_data symbol/colour mapping.

my_colours <- setNames(object = c(brewer.pal(3, 'Dark2'), 'black', 'blue'), 
                       nm = c(levels(hist_label$type), "not_in_data"))
my_shapes <- setNames(object = c(1, 2, 16, NA), 
                      nm = levels(hist_label$type))

ggplot(data=tot_hist_df,
       aes(x=mids, y=counts, shape=type, color=type))+
  geom_point() +
  geom_text(data=hist_label, 
            aes(x=10,y=1000,label=lab)) +
  scale_color_manual(values=my_colours) +
  scale_shape_manual(values=my_shapes) +
  facet_wrap(~expt, ncol=2)
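
To illustrate the name-based matching on its own, here is a minimal sketch with toy data. The data frame `pts` and the vector `named_cols` are hypothetical and not part of the question. The names are deliberately given in a different order than the data, and one name never occurs in the data:

```r
library(ggplot2)

# Toy data (hypothetical): only two of the named categories are present
pts <- data.frame(x = 1:2, y = 1:2, type = c("mid", "low"))

# Named colours: matched to the data by name, not by position
named_cols <- c(low = "red", mid = "green", high = "blue",
                not_in_data = "black")

p <- ggplot(pts, aes(x, y, colour = type)) +
  geom_point() +
  scale_colour_manual(values = named_cols)

# Inspect the colours computed for the point layer
point_data <- layer_data(p, 1)
print(point_data[, c("x", "colour")])
```

Because the lookup goes through the names, the point with type "mid" should get the colour named mid even though "mid" is neither first in the data nor first in the vector.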

The second tip is that you can avoid including the text in the legend by having the text layer not inherit the main plot's aesthetics. Inheriting them assigns shape and color to the text layer, which it probably doesn't need. Also note that since the scale no longer needs to update for the text's colour and shape, the correct order of factor levels from the histogram data is restored.

ggplot(data=tot_hist_df,
       aes(x=mids, y=counts, shape=type, color=type)) + 
  geom_point() +
  geom_text(data=hist_label, 
            aes(x=10,y=1000, label=lab),
            inherit.aes = FALSE) +
  scale_color_manual(values=my_colours) +
  scale_shape_manual(values=my_shapes) +
  facet_wrap(~expt, ncol=2)

An alternative way to achieve the same thing is to define the aesthetics not in the main plot call but in the point layer instead (the plot is identical to the one above).

ggplot(data=tot_hist_df) + 
  geom_point(aes(x=mids, y=counts, shape=type, color=type)) +
  geom_text(data=hist_label, 
            aes(x=10,y=1000, label = lab)) +
  scale_color_manual(values=sb.colors) +
  scale_shape_manual(values=sb.shapes) +
  facet_wrap(~expt, ncol=2)

I hope that helped!

teunbrand
  • Thank you so much. Your guidance is greatly appreciated. I have a lot to learn about ggplot() shape/color/size mappings. I'm afraid that getting things to work by experimentation can be very slow. I look forward to using your advice. – Bill Pearson Jun 01 '20 at 00:15