Here's an example of a binned density plot:

library(ggplot2)
n <- 1e5
df <- data.frame(x = rexp(n), y = rexp(n))
p <- ggplot(df, aes(x = x, y = y)) + stat_binhex()
print(p)

It would be nice to adjust the color scale so that the breaks are log-spaced, but trying

my_breaks <- plyr::round_any(exp(seq(log(10), log(5000), length = 5)), 10)
p + scale_fill_hue(breaks = as.factor(my_breaks), labels = as.character(my_breaks))

results in the error "Continuous variable () supplied to discrete scale_hue". It seems breaks expects a factor (maybe?) and was designed with categorical variables in mind.
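
For reference, the log-spaced breaks can be computed in base R, without plyr. This is only a sketch: `raw_breaks` is a name made up for illustration, and `round(x / 10) * 10` stands in for `round_any(x, 10)`. The breaks stay numeric because the fill in a hexbin plot is a count, which is a continuous variable.

```r
# Five log-spaced values between 10 and 5000, each rounded to the nearest 10.
# round(x / 10) * 10 stands in for plyr::round_any(x, 10).
raw_breaks <- exp(seq(log(10), log(5000), length.out = 5))  # illustrative name
my_breaks <- round(raw_breaks / 10) * 10
print(my_breaks)

# The hexbin fill is a count (numeric), so the breaks are kept numeric
# instead of being converted to a factor.
stopifnot(is.numeric(my_breaks))
```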

There's a workaround that isn't built in, which I'll post as an answer, but I think I might just be lost in my use of scale_fill_hue, and I'd like to know if there's anything obvious I'm missing.

Gregor Thomas

2 Answers

Yes! There is a trans argument to scale_fill_gradient, which I had missed before. With that we can get a solution with appropriate legend and color scale, and nice concise syntax. Using p from the question and my_breaks = c(2, 10, 50, 250, 1250, 6000):

p + scale_fill_gradient(name = "count", trans = "log",
                        breaks = my_breaks, labels = my_breaks)

My other answer is best used for more complicated functions of the data. Hadley's comment encouraged me to find this answer in the examples at the bottom of ?scale_gradient.
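
A self-contained sketch of this approach, assuming ggplot2 and hexbin are installed (the plotting part is skipped if they are not). The seed and the name `log_gaps` are only for illustration. It first checks that the chosen breaks are roughly evenly spaced on the log scale, which is why they give an evenly spaced legend. As far as I know, ggplot2 3.5.0 and later prefer the argument name `transform` over `trans`, so a deprecation warning may appear there.

```r
# Breaks from the answer; on a log scale they are roughly evenly spaced.
my_breaks <- c(2, 10, 50, 250, 1250, 6000)
log_gaps <- diff(log(my_breaks))  # illustrative name
print(round(log_gaps, 2))

# Plotting is guarded so the sketch still runs without ggplot2 or hexbin.
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("hexbin", quietly = TRUE)) {
  library(ggplot2)
  set.seed(1)  # arbitrary seed, only for reproducibility
  n <- 1e5
  df <- data.frame(x = rexp(n), y = rexp(n))
  # stat_bin_hex() is the current spelling of stat_binhex().
  p <- ggplot(df, aes(x = x, y = y)) +
    stat_bin_hex() +
    scale_fill_gradient(name = "count", trans = "log",
                        breaks = my_breaks, labels = my_breaks)
  print(p)
}
```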

Gregor Thomas
  • Man, you have two "best" answers for the same question :-). Awesome! – Eduardo Aug 08 '14 at 12:59
  • @Eduardo... well the question is mine too. Glad you're finding it useful! – Gregor Thomas Aug 08 '14 at 15:09
  • Well, `log`, `log10`, and `sqrt` are built-in transformations. Now I want to transform by dividing by 1000, so I used the `trans_new` function from the `scales` package and wrote my own function, `sci_trans <- function(){ trans_new('sci', function(x) x/1000, function(x) x*1000)}`, then called `p + scale_fill_gradient(trans='sci')`, but it does not work. What should I do? Thank you – Ling Zhang Dec 01 '16 at 08:19
  • I notice this works for all scale functions that use `continuous_scale` (e.g. `scale_fill_continuous`), not just `scale_fill_gradient` – arvi1000 May 20 '19 at 20:53
  • This formulation is still functional with R 3.6.3. – Luís de Sousa Jun 28 '21 at 09:24
  • If you need the `round_any` function and use dplyr instead of plyr, it can be found here: https://www.rdocumentation.org/packages/plyr/versions/1.8.6/topics/round_any – Amroco Nov 04 '21 at 23:48

Another way, using a custom function in stat_summary_hex:

ggplot(cbind(df, z = 1), aes(x = x, y = y, z = z)) +
  stat_summary_hex(fun = function(z) log(sum(z)))

This is now part of ggplot2, but it was originally inspired by the wonderful code by @kohske in this answer, which provided a custom stat_aggrhex. In ggplot2 versions after 2.0, use the code above (or the other answer). The original version, which needs the custom stat_aggrhex, was:

ggplot(cbind(df, z = 1), aes(x = x, y = y, z = z)) +
    stat_aggrhex(fun = function(z) log(sum(z))) +
    labs(fill = "Log counts")

This generates a hexbin plot whose fill shows the log of the counts.
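
A self-contained sketch of the stat_summary_hex version, assuming ggplot2 and hexbin are installed (the plotting part is skipped otherwise). The helper name `log_count` is made up for illustration. Because `z` is 1 for every row, `sum(z)` within a hexagon is the number of points in it, and passing the function by name as `fun` keeps it from being matched to the `mapping` argument.

```r
# With z = 1 for every row, sum(z) within a hexagon is the number of points
# in that hexagon, so log(sum(z)) is the log count.
log_count <- function(z) log(sum(z))  # hypothetical helper name
print(log_count(rep(1, 100)))         # same value as log(100)

# Plotting is guarded so the sketch still runs without ggplot2 or hexbin.
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("hexbin", quietly = TRUE)) {
  library(ggplot2)
  n <- 1e5
  df <- data.frame(x = rexp(n), y = rexp(n))
  p_log <- ggplot(cbind(df, z = 1), aes(x = x, y = y, z = z)) +
    stat_summary_hex(fun = log_count) +  # named, so it is not taken as `mapping`
    labs(fill = "Log counts")
  print(p_log)
}
```

Note that the legend here shows log values rather than counts; transforming the scale instead keeps the legend in the original units.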

Gregor Thomas
  • The aesthetic is `fill`, not `colour`, probably. – joran Nov 09 '11 at 18:51
  • @Andrie, thanks! It seems much more natural to me than messing with a color palette to get colors that are evenly spaced on the log scale. – Gregor Thomas Nov 09 '11 at 19:00
  • Seems a lot less natural to me. But it's always possible to transform the data or the scale. Transforming the scale will give you a sensible legend. – hadley Nov 13 '11 at 05:51
  • @hadley, well, now that I found the `trans` argument to `scale_gradient` I agree. "Natural" was probably a poor word choice in the first place too. I think what I was looking for was "syntactically simple", and I was under the false impression that the (now accepted) answer would be more complicated. That's what I get for not reading all the examples in the help file! – Gregor Thomas Nov 15 '11 at 19:22
  • @kohske the answer is quite good, but it does not work now. I tried to modify it, without success. Would you mind doing me a favor? @Gregor, I also tried another way, writing the function `sci_trans <- function(){ trans_new('sci', function(x) x/1000, function(x) x*1000)}` to divide by 1000, but it does not work – Ling Zhang Dec 01 '16 at 08:38
  • @LingZhang As @kohske has written in his answer, this can now be achieved by `ggplot(cbind(df, z = 1), aes(x = x, y = y, z = z)) + stat_summary_hex(function(z){log(sum(z))})`. Hope it helps – bluefish Jan 23 '17 at 22:39
  • I tried this formulation with R 3.6.3 and it fails with the message "Error: `mapping` must be created by `aes()`". You might need to update this answer. – Luís de Sousa Jun 28 '21 at 09:03
  • I'll test and update--but the R version shouldn't really matter, the `ggplot2` version is the main concern. What `ggplot2` version did you have trouble with? – Gregor Thomas Jun 28 '21 at 12:48