Background
When fitting generalised additive models (GAMs) in the past, I have used methods for the mgcv
package outlined in
Pedersen EJ, Miller DL, Simpson GL, Ross N (2019) Hierarchical generalized additive models in ecology: an introduction with mgcv. PeerJ 7: e6876.
I am currently transitioning to Bayesian alternatives and preferentially use the rethinking
package, but have stumbled into some problems when introducing splines to Stan and adding categorical predictor variables.
MRE
Below I follow the code outlined in this book:
McElreath R (2019) Statistical rethinking2: A Bayesian course with examples in R and Stan. Second Edition. Chapman and Hall/CRC
and this Bookdown:
Kurz S (2023) Statistical rethinking with brms, ggplot2, and the tidyverse: Second edition. https://bookdown.org/content/4857
# generate some sinusoidal data
d <- data.frame(x = rep(seq(0, 5 * pi, length.out = 100), 2),
group = rep(c("a", "b"), each = 100))
d$y[1:100] <- 2 * sin(2 * d$x[1:100]) + rnorm(100, 0, 1)
d$y[101:200] <- 5 * sin(3 * d$x[101:200]) + rnorm(100, 0, 0.5)
# generate b splines
knots <- 15
knot_list <- quantile(d$x , probs = seq(0, 1, length.out = knots))
require(splines)
B <- bs(d$x,
knots = knot_list[-c(1, knots)],
degree = 3, intercept = TRUE)
# fit quap model according to McElreath (2019)
require(rethinking)
r.mod <- quap(
alist(
y ~ dnorm(mu, sigma),
mu <- a + B %*% w,
a ~ dnorm(0, 1),
w ~ dnorm(0, 1),
sigma ~ dexp(1)
), data = list(y = d$y, B = B),
start = list(w = rep(0, ncol(B)))
)
precis(r.mod) # model produces parameter estimates
# attempt to fit identical ulam (Stan) model
r.mod.stan <- ulam( # note ulam instead of quap
alist(
y ~ dnorm(mu, sigma),
mu <- a + B %*% w,
a ~ dnorm(0, 1),
w ~ dnorm(0, 1),
sigma ~ dexp(1)
), data = list(y = d$y, B = B),
start = list(w = rep(0, ncol(B))),
chains = 8, cores = 8, iter = 1e4
)
# model fails due to apparently semantic error:
# Ill-typed arguments supplied to assignment operator =: lhs has type real and rhs has type row_vector
# fit brms model according to Kurz (2023)
require(brms)
d$B <- B
b.mod.stan <- brm(data = d,
family = gaussian,
y ~ 1 + B,
prior = c(prior(normal(0, 1), class = Intercept),
prior(normal(0, 1), class = b),
prior(exponential(1), class = sigma)),
iter = 1e4, warmup = 5e3, chains = 4, cores = 4,
seed = 4)
summary(b.mod.stan) # model produces same estimates as quap model
Questions
- Why does the
rethinking
Stan GAM not run while thebrms
GAM does? - How can I incorporate the grouping factor group?
I would appreciate a worked example in rethinking
.