Oh... this is a fun one to dissect.
To pinpoint when the various changes take place, I ran debugonce(ggplot_build) to see what goes on under the hood when a ggplot object is printed. The following steps show up when I print p2:
# earlier steps omitted
data <- by_layer(function(l, d) l$layer_data(plot$data),
                 layers, data, "computing layer data")
data <- by_layer(function(l, d) l$setup_layer(d, plot),
                 layers, data, "setting up layer")
layout <- create_layout(plot$facet, plot$coordinates)
data <- layout$setup(data, plot$data, plot$plot_env)
data <- by_layer(function(l, d) l$compute_aesthetics(d, plot),
                 layers, data, "computing aesthetics")
data <- lapply(data, scales_transform_df, scales = scales)
# later steps omitted
Let's run through the steps sequentially, and print the data object to the console after each step to see what has happened to it.
Step 1: Computing layer data
[[1]]
x y xerr
1 1 1 0.1
2 2 2 0.2
[[2]]
x y xerr
1 1 1 0.1
2 2 2 0.2
[[3]]
x y xerr
1 1 1 0.1
2 2 2 0.2
Nothing interesting to see here. The input data frame df has simply been replicated for each layer of the ggplot object.
Step 2: Setting up layer
# no change from above
Moving on.
Step 3: Creating layout & running layout$setup on data
[[1]]
x y xerr PANEL
1 1 1 0.1 1
2 2 2 0.2 1
[[2]]
x y xerr PANEL
1 1 1 0.1 1
2 2 2 0.2 1
[[3]]
x y xerr PANEL
1 1 1 0.1 1
2 2 2 0.2 1
Panel column added. Irrelevant for our investigation since we aren't messing around with facets (i.e. PANEL = 1 throughout).
Step 4: Computing aesthetics
[[1]]
x y PANEL group
1 1 1 1 -1
2 2 2 1 -1
[[2]]
xmin xmax x y PANEL group
1 0.9 1.1 1 1 1 -1
2 1.8 2.2 2 2 1 -1
[[3]]
xwidth x y PANEL group
1 0.1 1 1 1 -1
2 0.2 2 2 1 -1
Finally, the different data layers are starting to distinguish themselves from one another. For each layer, new columns are added based on its specific aesthetic mappings, and unused columns from the original dataset are stripped away. A group column has also been added at the back.
Step 5: Scales transformation
[[1]]
x y PANEL group
1 0.00000 1 1 -1
2 0.30103 2 1 -1
[[2]]
x xmin xmax y PANEL group
1 0.00000 -0.04575749 0.04139269 1 1 -1
2 0.30103 0.25527251 0.34242268 2 1 -1
[[3]]
x xwidth y PANEL group
1 0.00000 0.1 1 1 -1
2 0.30103 0.2 2 1 -1
Here, the 2nd and 3rd data layers have become truly different from one another. In layer 2, the scale transformation log10(.) has been applied directly to x, xmin and xmax, while layer 3 only received the same transformation for x.
There are two issues here. I have a workaround for one issue, but it's useless because the second issue remains.
Issue 1: No transformation on xwidth.
If we dig into scales_transform_df to see how it works, we'll find an exhaustive list of column names that scale_x_log10 considers when performing transformations. This can be accessed at the surface debugging level with scales$scales[[1]]$aesthetics, and corresponds to ggplot_global$x_aes:
[1] "x" "xmin" "xmax" "xend" "xintercept" "xmin_final"
[7] "xmax_final" "xlower" "xmiddle" "xupper" "x0"
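The same list can be inspected without a debugger by looking at the scale object itself. This is a minimal sketch, assuming a ggplot2 version where scale objects expose an aesthetics field (as 3.4.0 does); sc is just an illustrative name.

```r
library(ggplot2)

# A position scale carries the names of the aesthetics it will transform
sc <- scale_x_log10()
sc$aesthetics

# "xwidth" is not one of them, so scales_transform_df leaves that column alone
"xwidth" %in% sc$aesthetics
```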
Okay, we can rename "xwidth" to one of the above, no biggie. Call it xmiddle, for example, and we'll go from xmin = log10(x) - xwidth; xmax = log10(x) + xwidth (OP's original situation) to xmin = log10(x) - log10(xwidth); xmax = log10(x) + log10(xwidth). Every term is now on the log scale, but that is still not the log10(x - xwidth) / log10(x + xwidth) we are after, which brings us to...
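To make the difference concrete, here is a small base-R sketch using the two rows of df printed in Step 1 (x = 1, 2 and xerr = 0.1, 0.2). The object names wanted, untransformed and renamed are mine, purely for illustration.

```r
# Values taken from the example data frame printed in Step 1
x    <- c(1, 2)
xerr <- c(0.1, 0.2)

# What we want: bounds computed in data space, then log-transformed
# (this matches the xmin / xmax of layer 2 in Step 5)
wanted <- data.frame(xmin = log10(x - xerr), xmax = log10(x + xerr))

# OP's situation: x is transformed, the width is not
untransformed <- data.frame(xmin = log10(x) - xerr, xmax = log10(x) + xerr)

# After renaming the width to a recognised aesthetic such as xmiddle:
# both terms are transformed, but the subtraction still happens afterwards
renamed <- data.frame(xmin = log10(x) - log10(xerr),
                      xmax = log10(x) + log10(xerr))

wanted
untransformed
renamed
```

Neither of the last two data frames reproduces the first, because in both the sum and difference are taken after the log transformation rather than before it.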
Issue 2: The data transformation defined in GeomMyerrorbarh's setup_data function happens much, much later.
In my copy of the ggplot_build.ggplot function, the scales transformation happens in line 18, and the calculation for xmin = x - width / xmax = x + width defined in setup_data is called by compute_geom_1() in line 28. If we want the log10(.) transformation applied to the calculated xmin / xmax values, these calculations have to happen before the scales transformation.
Is it worth the trouble to address this within ggplot_build?
I'm leaning towards no, because I think it's not a Geom's core job to perform data transformations.
I'm not familiar with the thinking behind the function's design, but I imagine a change such as bringing up the geom's setup_data (or, equivalently, shoving down scales_transform_df) will be a non-trivial one, potentially breaking other things along the way.
This use case sounds like it can be more easily served with a wrapper function around one or more existing geom_*() functions that accept the final xmin / xmax values, with the data transformations performed within the wrapper.
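As a rough sketch of what such a wrapper could look like: the function geom_errorbarh_xwidth below is hypothetical (it is not part of ggplot2), and I'm assuming a ggplot2 3.4.x setup where geom_errorbarh() is available and aes() supports the {{ }} operator. It builds xmin / xmax inside aes(), so they exist before the scale transformation is applied.

```r
library(ggplot2)

# Hypothetical wrapper, not part of ggplot2: takes the centre and half-width
# as bare column names and passes the final xmin / xmax to geom_errorbarh()
geom_errorbarh_xwidth <- function(x, xwidth, ...) {
  geom_errorbarh(
    aes(xmin = {{ x }} - {{ xwidth }}, xmax = {{ x }} + {{ xwidth }}),
    ...
  )
}

# Same two rows as the data frame printed in Step 1
df <- data.frame(x = c(1, 2), y = c(1, 2), xerr = c(0.1, 0.2))

p <- ggplot(df, aes(x, y)) +
  geom_point() +
  geom_errorbarh_xwidth(x, xerr) +
  scale_x_log10()

# Inspect the built data of the error bar layer (layer 2)
layer_data(p, 2)[, c("xmin", "xmax")]
```

Because xmin and xmax are ordinary mapped aesthetics here, they are in the list the x scale recognises and get transformed along with x.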
Has it occurred for other Geoms?
Somewhat surprisingly (to me at least), yes.
This exact problem shows up in the ggplot2 package's own geom_tile function, as its underlying GeomTile performs the data transformation in setup_data too. Here's a simple illustration to trigger it, using my current version (ggplot2 3.4.0):
library(ggplot2)
df <- data.frame(
  x = rep(c(3, 6, 8, 10, 13), 2),
  y = rep(c(1, 2), each = 5),
  z = factor(rep(1:5, each = 2)),
  w = rep(diff(c(0, 4, 6, 8, 10, 14)), 2)
)
p1 <- ggplot(df, aes(fill = z)) +
  geom_rect(aes(xmin = x - w/2, xmax = x + w/2,
                ymin = y - 0.5, ymax = y + 0.5),
            colour = "grey50", alpha = 0.5, linewidth = 1) +
  geom_tile(aes(x = x, y = y, width = w, height = 1),
            colour = "grey50", alpha = 0.5, linewidth = 1) +
  ggtitle("linear scales") +
  theme_void() +
  theme(legend.position = "none")
p2 <- p1 + scale_x_log10() + ggtitle("transformed x scale")
p3 <- p1 + scale_y_log10() + ggtitle("transformed y scale")
p4 <- p1 + scale_x_log10() + scale_y_log10() + ggtitle("transformed both scales")
library(patchwork)
(p1 | p2) / (p3 | p4)
The geom_rect layer accepts the aesthetic mappings c(xmin, xmax, ymin, ymax), while geom_tile accepts c(x, y, width, height). They look identical when default linear scales are used, but go out of sync once transformed scales are introduced in either direction. The geom_tile version even overlaps with itself!
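The overlap can be reproduced with plain arithmetic on the x values from the example. This is a base-R sketch that assumes GeomTile's setup_data computes xmin = x - width / 2 and xmax = x + width / 2 after x has been log-transformed, with width left untransformed, as described above; the variable names are illustrative.

```r
# x centres and widths of the five tiles in the example data frame
x <- c(3, 6, 8, 10, 13)
w <- diff(c(0, 4, 6, 8, 10, 14))

# geom_rect: edges computed in data space inside aes(), then log-transformed
rect_xmin <- log10(x - w / 2)
rect_xmax <- log10(x + w / 2)

# geom_tile (under the assumption above): x is log-transformed first, then
# the untransformed width is added / subtracted in log space
tile_xmin <- log10(x) - w / 2
tile_xmax <- log10(x) + w / 2

# Rect edges stay contiguous: each right edge meets the next left edge
all.equal(rect_xmax[-5], rect_xmin[-1])

# Tile edges overlap: each right edge lies beyond the next left edge
all(tile_xmax[-5] > tile_xmin[-1])
```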

That said, transforming scales while drawing tiles (for a heatmap?) seems like a rather niche use case, and I haven't seen this issue brought up elsewhere before. Perhaps a cautionary note in the help files, warning against using scale transformations, would suffice, unless the community has more pressing arguments for its usefulness.