If you use ggplot, you can make the smooths right in geom_smooth
. As ggplot demands long form, using list columns and tidyr::unnest
is a useful substitute for replicate
, though there are lots of ways to accomplish the data generation step.
library(tidyverse)
set.seed(47)
# A nice theme with a white background to help make low-opacity objects visible
theme_set(hrbrthemes::theme_ipsum_tw())
df <- tibble(replication = seq(100), # scaled down a little
x = map(replication, ~runif(1000, 0, 10)),
y = map(x, ~rpois(1000, lambda = 2*.x + 0.2*.x*sin(.x)))) %>%
unnest()
# base plot with aesthetics and points
point_plot <- ggplot(df, aes(x, y, group = replication)) +
geom_point(alpha = 0.01, stroke = 0)
point_plot +
geom_smooth(method = lm, formula = y ~ splines::ns(x), size = .1, se = FALSE)

Controlling the line's alpha can be really helpful for this sort of plot, but the alpha
parameter in geom_smooth
controls the opacity of the standard error ribbon. To set the alpha of the line, use geom_line
with stat_smooth
:
point_plot +
stat_smooth(geom = 'line', method = lm, formula = y ~ splines::ns(x),
color = 'blue', alpha = 0.03)

Currently, the smooth isn't doing much more than OLS here. To make it more flexible, set the degrees of freedom:
point_plot +
stat_smooth(geom = 'line', method = lm, formula = y ~ splines::ns(x, df = 5),
color = 'blue', alpha = 0.03)

Given the response is Poisson, it may be worth scaling up to Poisson regression with glm
. The largest impact here is that when x is small, y doesn't dip all the way to 0:
point_plot +
stat_smooth(geom = 'line', method = glm, method.args = list(family = 'poisson'),
formula = y ~ splines::ns(x, df = 5), color = 'blue', alpha = 0.03)

Adjust further as you like.