
I'm trying to plot fitted model effects in ggplot2 as an alternative to the plots returned by the effects package, and I'm running into issues mapping a continuous grouping factor to scale_color_gradient. The issues stem from the grouping factor's skewed distribution: with the default color mapping, most of the colors are indistinguishable from one another, but a log transformation messes up the legend. I've looked at a few related SO answers ("Is there a built-in way to do a logarithmic color scale in ggplot2?", "R colour scale for logarithmic data?", and "Logarithmic color scale in ggplot2 squishes certain legend numbers"), but none of them quite fits the bill.

Here's my data:

myEffs <- structure(list(PrimeShiftIndex = c(-4, -0.2, 4, -4, -0.2, 4,
-4, -0.2, 4, -4, -0.2, 4), PrimeVowelDur = c(0.03, 0.03, 0.03,
0.06, 0.06, 0.06, 0.09, 0.09, 0.09, 1.59, 1.59, 1.59), fit = c(-0.184306629528313,
-0.164313919815862, -0.142216714344205, -0.200749305969527, -0.178039844592615,
-0.152939913597082, -0.210367655099129, -0.186068995874736, -0.159212583047775,
-0.278488972243709, -0.242934925102426, -0.203638346683111),
    se = c(0.0437103286485701, 0.0342751848548937, 0.0446524040373885,
    0.0417352317881704, 0.0340007746839495, 0.042093900962637,
    0.0441609220226782, 0.0341565687974652, 0.0442166991273061,
    0.0995662189943997, 0.041203801253227, 0.0993299532144987
    ), lower = c(-0.269979086288845, -0.231493448847753, -0.229735643449126,
    -0.282550563276701, -0.24468152835471, -0.235444164230796,
    -0.296923277064622, -0.253016036857112, -0.245877528409642,
    -0.473639245768052, -0.323694575976204, -0.398325538129639
    ), upper = c(-0.0986341727677806, -0.0971343907839703, -0.0546977852392849,
    -0.118948048662354, -0.11139816083052, -0.0704356629633676,
    -0.123812033133635, -0.119121954892359, -0.0725476376859078,
    -0.0833386987193667, -0.162175274228647, -0.00895115523658357
    )), class = "data.frame", row.names = c(NA, -12L), .Names = c("PrimeShiftIndex",
"PrimeVowelDur", "fit", "se", "lower", "upper"))

Here, the grouping factor PrimeVowelDur is skewed right, with values at 0.03, 0.06, 0.09, and 1.59. Here are some failed attempts to get distinguishable colors and a readable legend (with obnoxiously wide lines to highlight the color contrast or lack thereof).

library(ggplot2)
p <- ggplot(myEffs, aes(x=PrimeShiftIndex, y=fit, group=PrimeVowelDur, color=PrimeVowelDur)) +
  geom_line(size=6)
##Legend fine but line colors indistinguishable
p
##Missuse's suggestion yields the same issue
p + scale_color_gradientn(colors = colorRampPalette(colors = c("#132B43", "#56B1F7"))(nrow(myEffs)), 
                          values = scales::rescale(log(sort(myEffs$PrimeVowelDur))))


##Colors distinguishable but legend messed up
p + scale_color_gradient(trans="log")
##Using trans="log" with pre-defined breaks as per Gregor doesn't make legend much better
brks <- seq(0, 1.6, length.out=5)
p + scale_color_gradient(trans="log", breaks=brks, labels=brks)
##Nor does S Rivero's suggestion
p + scale_color_gradient(trans="log", breaks=brks, labels=brks, guide="legend")

My intuition is that ggplot should be able to naturally handle the p + scale_color_gradient(trans="log") solution without messing up the legend. Anyway, I've got a proposed solution, but I want to see if there's anything savvier out there that I'm missing.

Dan Villarreal
  • If you are grouping by a factor which takes discrete values, shouldn't you be using a discrete colour scale? – neilfws Aug 10 '18 at 01:06
  • Here, the PrimeVowelDur grouping factor is based on a continuous predictor, but it's set at four values for the purposes of constructing model predictions by the effects package. These values represent the minimum (=1Q), median, 3Q, and maximum for PrimeVowelDur in the original data. – Dan Villarreal Aug 10 '18 at 03:23

1 Answer


If you pass the actual values of the continuous grouping factor, rather than evenly spaced values, to both the breaks and labels arguments of scale_color_gradient, the legend appears as expected. This works with either the default colorbar guide or the legend guide:

brks2 <- sort(unique(myEffs$PrimeVowelDur))
p + scale_color_gradient(trans="log", breaks=brks2, labels=brks2)
p + scale_color_gradient(trans="log", breaks=brks2, labels=brks2, guide="legend")
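
A rough base-R sketch of why the data-based breaks behave better than the evenly spaced ones, as far as I can tell. It computes where each break lands on a log-transformed scale whose limits are the range of PrimeVowelDur. The helper log_position() is hypothetical (not part of ggplot2 or scales) and only mimics rescaling after a log transform; the assumption is that breaks falling outside the scale limits are not drawn in the legend.

```r
# Values of the grouping factor, as in myEffs$PrimeVowelDur
vals <- c(0.03, 0.06, 0.09, 1.59)

# log_position() is a hypothetical helper: where a value lands on a
# log-transformed scale, with 0 = scale minimum and 1 = scale maximum
log_position <- function(x, limits = range(vals)) {
  (log(x) - log(limits[1])) / (log(limits[2]) - log(limits[1]))
}

even_brks <- seq(0, 1.6, length.out = 5)  # evenly spaced breaks (brks)
data_brks <- sort(unique(vals))           # data-based breaks (brks2)

even_pos <- log_position(even_brks)
data_pos <- log_position(data_brks)

# Breaks whose position is outside [0, 1] fall outside the scale limits
even_brks[even_pos >= 0 & even_pos <= 1]
data_brks[data_pos >= 0 & data_pos <= 1]
```

The break at 0 maps to -Inf under the log, and 1.6 sits just above the maximum of 1.59, so only three of the evenly spaced breaks land within the limits, and those three are bunched toward the top of the log scale. All four data-based breaks land within the limits and are spread across the scale.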
Dan Villarreal