How do you override the aes size value for a ggplot2 legend guide based on a column in the data set?

Refer to this example (Edit 2: added Trial C, and changed the line size to use a log scale):

library(data.table)
library(ggplot2)
set.seed(26798)

dt<-rbind(data.table(Trial="A",Value=rweibull(1000,1.0,0.5)),
      data.table(Trial="B",Value=rweibull(100,1.2,0.75)),
      data.table(Trial="C",Value=rweibull(10,1.3,0.8)))

# Add a count and something like a cumulative distribution:
dt2<-dt[order(Trial,Value),list(Value,N=.N),by=Trial][,list(Value,N,y=1-cumsum(N)/sum(N)),by=Trial]
dt2
##      Trial        Value    N     y
##   1:     A 0.0003628745 1000 0.999
##   2:     A 0.0013002615 1000 0.998
##   3:     A 0.0017002173 1000 0.997
##   4:     A 0.0022597343 1000 0.996
##   5:     A 0.0026608082 1000 0.995
##  ---                              
##1096:     B 1.6821827814  100 0.040
##1097:     B 2.2431595707  100 0.030
##1098:     B 2.5122479833  100 0.020
##1099:     B 2.5519954416  100 0.010
##1100:     B 2.6848412995  100 0.000

ggplot(dt2) +
  geom_line(aes(x=Value,y=y,group=Trial,color=Trial,size=N)) +
  scale_size(range=c(0.1, 2), trans="log") +
  guides(size=FALSE, color=guide_legend(override.aes=list(size=2)))

[Figure: plot of three trials]

I would like the line thickness for each value of Trial in the guide legend to match the line in the plot (i.e. "A" should be thick and "B" should be thin). Edit 1: @Arun and @ChelseaE gave good suggestions for adjusting each thickness manually, but my actual dataset has many factor levels and is constantly changing, so I need it to be "dynamic".

The answer from @DidzisElferts to a similar question (Control ggplot2 legend look without affecting the plot) shows how to set the size to a static value. The size=2 part in the last line of the example above lets me change the line size of the legend, but I would like it to match the size of the line in the plot. Using size=N instead seems logical, but it gives the error "object 'N' not found". What is the correct syntax?

Desired output:

[Figure: plot of three trials with desired legend]

dnlbrky
  • I don't understand: if you want the line size to correspond to Trial, why do you map it to N? – baptiste Aug 21 '13 at 22:08
  • Well I'm probably not doing something correctly in my implementation, but my intent in the plot is to use the line thickness to show that the number of observations for each trial is different (captured by N). Then I would like the legend to match the plot. My actual data has 5 to 10 lines rather than two, and it's visually helpful to pick out the lines in the plot when the legend matches. – dnlbrky Aug 22 '13 at 12:24

2 Answers

You need to supply a size for each legend entry (A and B), but you've set only one. Try this:

p <- ggplot(dt2) +
  geom_line(aes(x=Value,y=y,group=Trial,color=Trial,size=N)) +
  scale_size(range=c(0.1, 2)) +
  guides(size=FALSE, color=guide_legend(override.aes=list(size=c(2, .1))))

Following OP's comment:

Okay, in that case, you'll have to do a bit more work (there may be easier ways, but I can't think of any at the moment).

scales <- c(0.1, 2) # the range you want: min, max
vals <- summary(lm(scales ~ c(min(dt2$N), max(dt2$N))))$coefficients[,1]
sizes <- vals[2] * unique(dt2$N) + vals[1]

ggplot(dt2) +
  geom_line(aes(x=Value,y=y,group=Trial,color=Trial,size=N)) +
  scale_size(range=scales) +
  guides(size=FALSE, color=guide_legend(override.aes=list(size=sizes)))

This should work. Try it and let me know if you have any issues.
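
The lm() call above fits a straight line through just two points, (min N, smallest size) and (max N, largest size), so it amounts to linear interpolation. The sketch below illustrates that with made-up counts; `n`, `size_range`, `sizes_lm`, and `sizes_direct` are names chosen for this example and do not appear in the original code. It assumes the plot's size scale is linear in N, which is not the case when trans="log" is used as in the question.

```r
# Hypothetical per-Trial counts, in legend order (A, B, C)
n <- c(1000, 100, 10)
size_range <- c(0.1, 2)  # min and max line size, as in the answer

# The answer's approach: fit a line through the two end points
coefs <- coef(lm(size_range ~ c(min(n), max(n))))
sizes_lm <- unname(coefs[2] * n + coefs[1])

# The same mapping written as explicit linear interpolation
sizes_direct <- size_range[1] +
  (n - min(n)) / (max(n) - min(n)) * diff(size_range)

# Both vectors should agree up to floating-point error
print(all.equal(sizes_lm, sizes_direct))
```

Either vector can then be passed as override.aes=list(size=...), provided its order matches the order of the legend keys.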

Arun
  • Thanks for the good suggestion, @Arun. However, in my actual data set I have a lot of values that are constantly changing, so I need this to be more "dynamic". I've updated the question to reflect this. – dnlbrky Aug 21 '13 at 18:13
  • @dnlbrky, check out the edit to see if it fits your general scenario. Let me know if you have trouble. – Arun Aug 21 '13 at 21:07
  • Thanks, this route works. Based on preference I will use this modification: `size = rescale(dt2[, .N, by=list(Trial)][order(Trial), log(N)], scales)`. – dnlbrky Sep 04 '13 at 16:41
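
The modification in the last comment can be sketched as a standalone computation of the legend sizes. This is an illustration under assumptions, not the poster's exact code: it rebuilds the counts from `dt` instead of `dt2`, and `size_range`, `counts`, and `legend_sizes` are names introduced here. `rescale()` comes from the scales package.

```r
library(data.table)
library(scales)

set.seed(26798)
dt <- rbind(data.table(Trial = "A", Value = rweibull(1000, 1.0, 0.5)),
            data.table(Trial = "B", Value = rweibull(100, 1.2, 0.75)),
            data.table(Trial = "C", Value = rweibull(10, 1.3, 0.8)))

size_range <- c(0.1, 2)  # same range that is passed to scale_size()

# One row per Trial, sorted so the order matches the legend keys
counts <- dt[, .N, by = Trial][order(Trial)]

# Map log(N) linearly onto the size range, mirroring trans = "log"
legend_sizes <- rescale(log(counts$N), to = size_range)
print(data.table(Trial = counts$Trial, N = counts$N, size = legend_sizes))
```

The result would be used as guides(color=guide_legend(override.aes=list(size=legend_sizes))). Whether the legend then matches the plotted lines exactly depends on how the installed ggplot2 version maps values to sizes; as far as I know, newer releases scale area in scale_size() and offer scale_radius() for a linear mapping, so treat this as a starting point.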

Try adding the size range inside the guide_legend:

ggplot(dt2) +
  geom_line(aes(x = Value, y = y, group = Trial, color = Trial, size = N)) +
  scale_size(range = c(0.1, 2)) +
  guides(size = FALSE, color = guide_legend(override.aes = list(size = range(0.1, 2))))

EDIT: (Might work, not sure)

You could also try creating a vector for N (N <- dt2$N) and then using size = N

Hope this helps.

americo