
I am trying to show tidal height as geom_area on a secondary y-axis, but have been unsuccessful so far.

Here is my code/graph that I currently have:

#Creating pH Time series plots both sites
ggplot(data = RB_EOS,
       aes(
         x = Time,
         y = pH_corrected_average,
         color = Site,
         group = Site
       )) +
  geom_line() +
  geom_line(aes(y = 7.55 + Tidal_Depth_ft * 0.075),
            linetype = "dotdash") +
  scale_y_continuous(
    name = "pH",
    limits = c(7.55, 7.85),
    breaks = seq(7.55, 7.85, 0.05),
    sec.axis = sec_axis( ~ (. - 7.55) / 0.075,
                         name = "Tidal Height (ft)",
                         breaks = seq(0, 4, 1))
  ) +
  geom_point() +
  #geom_point(aes(y = 7.55 + Tidal_Depth_ft*0.075)) +
  facet_wrap(~ Week, nrow = 3) +
  theme_bw(base_size = 13) +
  theme(legend.position = "top") +
  ggtitle("pH Tidal Trends") +
  labs(subtitle = "Summer 2022") +
  xlab("Time (PST)") +
  ylab("pH") +
  scale_x_discrete(guide = guide_axis(angle = 90))

plot for code above

This is the line plot I created of my pH values (y) over time (x), with a secondary y-axis added to show tidal height. I was able to draw the tidal height as a line (without points), which I changed to dotdash so it is easy to see how the tidal height corresponds to the pH values at each site. However, I have been unable to get geom_area to work against the secondary y-axis so that the tidal height appears as a smoothed, filled area over each tidal cycle.

I tried this code and nothing happened:

#Creating pH Time series plots both sites
ggplot(data = RB_EOS,
       aes(
         x = Time,
         color = Site,
         group = Site
       )) +
  geom_line(aes(y = pH_corrected_average)) +
  geom_area(aes(y = 7.55 + Tidal_Depth_ft * 0.075)) +
  scale_y_continuous(
    name = "pH",
    limits = c(7.55, 7.85),
    breaks = seq(7.55, 7.85, 0.05),
    sec.axis = sec_axis( ~ (. - 7.55) / 0.075,
                         name = "Tidal Height (ft)",
                         breaks = seq(0, 4, 1))
  ) +
  geom_point(aes(y = pH_corrected_average)) +
  #geom_point(aes(y = 7.55 + Tidal_Depth_ft*0.075)) +
  facet_wrap(~ Week, nrow = 3) +
  theme_bw(base_size = 13) +
  theme(legend.position = "top") +
  ggtitle("pH Tidal Trends") +
  labs(subtitle = "Summer 2022") +
  xlab("Time (PST)") +
  ylab("pH") +
  scale_x_discrete(guide = guide_axis(angle = 90))

2nd plot for code above

As you can see, it didn't come out the way I expected. Any help is appreciated. Thanks!

chemdork123

1 Answer


Your area geom is not being drawn because of the limits set in scale_y_continuous(). I'd bet that if you remove the limits = argument (or comment out scale_y_continuous(...) entirely), you will see geom_area() drawn. The giveaway is that the legend for the area geom is drawn properly even though the area itself is missing from the panel.

The solution is to set the limits with coord_cartesian(ylim = ...) instead of the limits = argument of scale_y_continuous(...).

The reason is that setting limits = in scale_y_continuous() drops any data outside those limits before the plot is drawn. That is fine for geom_line(), but geom_area() works a bit differently: it fills between the upper edge (your line) and the lower edge (y = 0), and both must fall inside the limits or the area is not drawn. Here's an example.

Example of the issue

Here's an example plot code:

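library(ggplot2)

# Two categories, each with a "main" series (y) and a lower series (y1)
set.seed(8675309)
df <- data.frame(
  x = rep(1:100, 2),
  y = c(sample(45:50, 100, replace = TRUE), sample(30:35, 100, replace = TRUE)),
  y1 = c(sample(10:13, 100, replace = TRUE), sample(3:7, 100, replace = TRUE)),
  category = rep(c("A", "B"), each = 100))

# Lines for y, dashed semi-transparent areas for y1
plot <-
  ggplot(df, aes(x = x, y = y, color = category, fill = category)) + theme_bw() +
  geom_line() +
  geom_area(aes(y = y1), linetype = 2, alpha = 0.2)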

Giving you this:

enter image description here

If we set limits on the y axis to be above 0, the "lower" area geom disappears. Note here that ylim(5,50) is really just a wrapper for scale_y_continuous(limits=c(5,50)):

plot + ylim(5,50)

enter image description here

Note that the "upper" area geom remains (the one in red). This is because that area is drawn with all lower values that are above the lower limit we set of 5. If we increase the limit so that the lower points in the red area are excluded, the entire geom disappears:

plot + ylim(8,50)

enter image description here

Importantly, note that just like in OP's case, the legend is still drawn. This is the "smoking gun" here to indicate that there's nothing wrong with the geom_area command - it's with the limits set.
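To see in isolation what the limits do to the data, here is a minimal sketch using scales::censor(), which is the default out-of-bounds (oob) handler for continuous scales. The y values are made up for illustration; 0 stands in for the baseline that geom_area() fills down to.

```r
library(scales)

# Hypothetical y values; 0 plays the role of the area's lower edge.
y <- c(0, 6, 12, 48, 60)

# With limits of c(5, 50), anything outside the range is replaced by NA,
# which is why a geom that needs y = 0 loses its lower edge.
censored <- censor(y, range = c(5, 50))
print(censored)
```

The first and last values come back as NA, while the values inside the limits are untouched.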

The fix

The way to solve this is to set the limits with coord_cartesian(ylim = ...) instead of scale_y_continuous(limits = ...). Unlike ylim(), coord_cartesian() changes what is shown in the plot area but does not remove any points outside those limits. They are, in effect, still "drawn"; the plot is just zoomed in to the specified range. ylim() and scale_*_continuous(limits = ...) actually remove data points outside the indicated limits.

So, when we do the following, you can still see each geom drawn (even the "tips" of the bottom area geom):
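The code for this step is not shown here, so the following is a reconstruction rather than the original. It assumes the same limits as the ylim(8, 50) example, passed to coord_cartesian() instead, and rebuilds df and plot so that it runs on its own.

```r
library(ggplot2)

# Same example data as above
set.seed(8675309)
df <- data.frame(
  x = rep(1:100, 2),
  y = c(sample(45:50, 100, replace = TRUE), sample(30:35, 100, replace = TRUE)),
  y1 = c(sample(10:13, 100, replace = TRUE), sample(3:7, 100, replace = TRUE)),
  category = rep(c("A", "B"), each = 100)
)

plot <- ggplot(df, aes(x = x, y = y, color = category, fill = category)) +
  theme_bw() +
  geom_line() +
  geom_area(aes(y = y1), linetype = 2, alpha = 0.2)

# Assumed limits: zoom the view to 8..50 without dropping any data
zoomed <- plot + coord_cartesian(ylim = c(8, 50))
print(zoomed)
```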

enter image description here

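I'm pretty sure this is what's happening with OP's problem. Note that coord_cartesian() has no argument for a secondary axis: keep sec.axis = sec_axis(...) inside scale_y_continuous(), drop the limits = argument from it, and set the visible range with coord_cartesian(ylim = c(7.55, 7.85)).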
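Applied to the plot in the question, the idea might look like the sketch below. RB_EOS is not available, so demo is a hypothetical stand-in with made-up values and a numeric Hour column in place of Time. The fill = Site mapping, alpha value and position = "identity" are also assumptions: geom_area() stacks groups by default, which would add the two sites' rescaled tide values together.

```r
library(ggplot2)

# Hypothetical stand-in for RB_EOS: two sites, hourly values over one day
set.seed(1)
demo <- data.frame(
  Hour = rep(0:23, 2),
  Site = rep(c("EOS", "RB"), each = 24),
  pH_corrected_average = c(runif(24, 7.60, 7.70), runif(24, 7.70, 7.80)),
  Tidal_Depth_ft = rep(2 + 1.5 * sin(2 * pi * (0:23) / 12.4), 2)
)

p <- ggplot(demo, aes(x = Hour, color = Site, group = Site)) +
  # Tide rescaled onto the pH axis; identity position avoids stacking sites
  geom_area(aes(y = 7.55 + Tidal_Depth_ft * 0.075, fill = Site),
            position = "identity", alpha = 0.2, linetype = "dotdash") +
  geom_line(aes(y = pH_corrected_average)) +
  geom_point(aes(y = pH_corrected_average)) +
  # No limits = here, so the area's baseline at y = 0 is not dropped
  scale_y_continuous(
    name = "pH",
    breaks = seq(7.55, 7.85, 0.05),
    sec.axis = sec_axis(~ (. - 7.55) / 0.075,
                        name = "Tidal Height (ft)",
                        breaks = seq(0, 4, 1))
  ) +
  # Zoom to the pH range instead of filtering the data
  coord_cartesian(ylim = c(7.55, 7.85)) +
  theme_bw(base_size = 13) +
  theme(legend.position = "top")
print(p)
```

The facet, title and x-axis settings from the question can be added back unchanged; they are left out here only to keep the sketch short.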

chemdork123