2

Based on the example in Master Software Development in R, I wrote a new geom_my_point(), adapting the alpha depending on the number of data points.

This works fine, but the alpha value of the label is not correct if alpha is explicitly set.

Here the code for the figures:

d <- data.frame(x = runif(200))
d$y <- 1 * d$x + rnorm(200, 0, 0.2)
d$z <- factor(sample(c("group1", "group2"), size = 200, replace = TRUE))

require("ggplot2")
gg1 <- ggplot(d) + geom_my_point(aes(x, y, colour = z)) + ggtitle("gg1")
gg2 <- ggplot(d) + geom_my_point(aes(x, y, colour = z), alpha = 1) + ggtitle("gg2")
gg3 <- ggplot(d) + geom_my_point(aes(x, y, colour = z, alpha = z)) + ggtitle("gg3")

enter image description here

Here the code for the geom_*():

geom_my_point <- function(mapping = NULL, data = NULL, stat = "identity",
                            position = "identity", na.rm = FALSE,
                            show.legend = NA, inherit.aes = TRUE, ...) {
  ggplot2::layer(
    geom = GeomMyPoint, mapping = mapping,
    data = data, stat = stat, position = position,
    show.legend = show.legend, inherit.aes = inherit.aes,
    params = list(na.rm = na.rm, ...)
  )
}


GeomMyPoint <- ggplot2::ggproto("GeomMyPoint", ggplot2::Geom,
  required_aes = c("x", "y"),
  non_missing_aes = c("size", "shape", "colour"),
  default_aes = ggplot2::aes(
    shape = 19, colour = "black", size = 2,
    fill = NA, alpha = NA, stroke = 0.5
  ),

  setup_params = function(data, params) {
    n <- nrow(data)
    if (n > 100 && n <= 200) {
      params$alpha <- 0.3
    } else if (n > 200) {
      params$alpha <- 0.15
    } else {
      params$alpha <- 1
    }
    params
  },

  draw_panel = function(data, panel_scales, coord, alpha) {
    if (is.character(data$shape)) {
      data$shape <- translate_shape_string(data$shape)
    }

    ## Transform the data first
    coords <- coord$transform(data, panel_scales)

    ## Get alpha conditional on number of data points
    if (any(is.na(coords$alpha))) {
      coords$alpha <- alpha
    }

    ## Construct a grid grob
    grid::pointsGrob(
      x = coords$x,
      y = coords$y,
      pch = coords$shape,
      gp = grid::gpar(
        col = alpha(coords$colour, coords$alpha),
        fill = alpha(coords$fill, coords$alpha),
        fontsize = coords$size * ggplot2::.pt + coords$stroke * ggplot2::.stroke / 2,
        lwd = coords$stroke * ggplot2::.stroke / 2
      )
    )
  },

  draw_key = function(data, params, size) {
    data$alpha <- params$alpha
    ggplot2::draw_key_point(data, params, size)
  }
)

EDIT:

According to the comment of @teunbrand, the problem for the plot qq2 can be solved by the following adaptions to the draw_key() function:

draw_key = function(data, params, size) { 
  if (is.na(data$alpha)) { 
    data$alpha <- params$alpha
  } 
  ggplot2::draw_key_point(data, params, size)
}

But this still does not solve the problem with the graph qq3 - so the underlying question is why alpha is not correctly represented by the data argument of the draw_key() function. Compare also the following plot qq4, in which the size is correctly displayed in the legend (set a browser() w/i draw_key()):

gg4 <- ggplot(d) + geom_my_point(aes(x, y, colour = z, alpha = z, size = z)) + ggtitle("gg4")

enter image description here

moremo
  • 315
  • 2
  • 11
  • 1
    If you do `data$alpha[is.na(data$alpha)] <- params$alpha` instead of `data$alpha <- params$alpha` case 1 and 2 work, but I can't seem to figure out why case 3 won't work. For some reason the appropriate alpha isn't included in that case (but is for GeomPoint). – teunbrand Oct 19 '21 at 16:40
  • @teunbrand, as with most of my questions about `ggproto()`, thanks for the super fast and correct reply. This is certainly a better workaround, which works fine for the plot gg2. But I think the real question is, why is `data` not properly adapted inside the function `draw_key()`? Compare also my edit in the original question. – moremo Oct 20 '21 at 07:14
  • 1
    If you work in RStudio and put a `browser()` somewhere at the top of the `draw_key` method and then render the plot, you can inspect the variables as they pass through the function. For reasons I don't understand, both datapoints in gg3 have `NA` for the `alpha` column, whereas they should have non-missing alphas. This doesn't occur in vanilla `geom_point()` so I'm confused here as well. – teunbrand Oct 20 '21 at 07:31
  • @teunbrand, I totally agree and that's the point I was trying to make in the example gg4. Inside the `draw_key()` function `data$size` is passed correctly with 0.1 and 0.9 respectively, `data$alpha` incorrectly remains at NA. – moremo Oct 20 '21 at 07:47

0 Answers0