I am using mgcv to fit GAMs with random effects, e.g.:

gam_fit <- gam(y ~ s(age) + s(region, bs='re'), 
               data = my_data, 
               method = 'REML')

See Gavin Simpson's excellent post about using random effects in GAMs with mgcv.

Two questions:

  1. How to extract estimates of the random effects? I found extract_ranef() in a separate package, but maybe mgcv has its own method?
  2. In plot(gam_fit), what is being plotted in the effects vs Gaussian quantiles plot? How should these plots be used?
Shira

1 Answer

How to extract estimates of the random effects? I found extract_ranef() in a separate package, but maybe mgcv has its own method?

You can use coef(gam_fit), but it returns every model coefficient, including the intercept and the spline basis coefficients of s(age). To recover only the random-effect estimates, I would use:

coefs <- coef(gam_fit)
coefs[grep("s(region)", names(coefs), fixed=TRUE)]
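For a self-contained illustration of the extraction above, here is a minimal sketch on simulated data. The simulated my_data, the region labels, and the names n_region, true_re and re_est are made up for the example; only the model formula comes from the question. It assumes the coefficients of s(region, bs = "re") come back in the order of the factor levels, which is how a single-factor random-effect smooth is set up as far as I know. gam.vcomp() is shown as one way to get the estimated standard deviation of the random effect from a REML fit.

```r
library(mgcv)

set.seed(1)
# Hypothetical stand-in for my_data: 10 regions, 30 observations each
n_region <- 10
my_data <- data.frame(
  region = factor(rep(paste0("r", 1:n_region), each = 30)),
  age    = runif(n_region * 30, 20, 80)
)
# One true random intercept per factor level, indexed by level position
true_re <- rnorm(n_region, sd = 1)
my_data$y <- sin(my_data$age / 10) +
  true_re[as.integer(my_data$region)] +
  rnorm(nrow(my_data), sd = 0.5)

gam_fit <- gam(y ~ s(age) + s(region, bs = "re"),
               data = my_data, method = "REML")

# Keep only the coefficients of the random-effect smooth
coefs  <- coef(gam_fit)
re_est <- coefs[grep("s(region)", names(coefs), fixed = TRUE)]

# Label the estimates by factor level (assumes level order, see above)
names(re_est) <- levels(my_data$region)
re_est

# Estimated standard deviations of the smooth terms and the residual scale
gam.vcomp(gam_fit)
```

In this simulation the extracted values can be compared against true_re, which is a quick way to convince yourself that the order of the coefficients matches the order of the levels.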

In plot(gam_fit), what is being plotted in the effects vs Gaussian quantiles plot? How should these plots be used?

The x-axis shows Gaussian quantiles, i.e. the values expected for a standard normal variable. The y-axis shows the predicted random intercepts. In mixed-effects models, the random intercepts are assumed to follow a normal distribution, so any deviation of the points from the straight line indicates a departure from normality. If points at the left end of the x-axis fall below the line, some predicted random intercepts are lower than would be expected under a normal distribution. If points at the right end rise above the line, some predicted random intercepts are higher than would be expected. If both happen (or the reverse: the left end above the line and the right end below it), the kurtosis, i.e. the thickness of the tails, differs from that of a normal distribution. I'd expect such deviations to affect mostly inference, and predictive accuracy only to a much lesser extent.
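To see what goes into that panel, the plot can be roughly reconstructed by hand. This is a sketch on simulated data (the data frame, n_region, re_est and qq are made up for the example), and it assumes that plot.gam draws something equivalent to qqnorm() with qqline() on the random-effect coefficients, which is my understanding of what it does for bs = "re" terms.

```r
library(mgcv)

set.seed(2)
# Hypothetical data: 40 regions so the QQ plot has a reasonable number of points
n_region <- 40
my_data <- data.frame(
  region = factor(rep(seq_len(n_region), each = 20)),
  age    = runif(n_region * 20, 20, 80)
)
my_data$y <- sin(my_data$age / 10) +
  rnorm(n_region)[as.integer(my_data$region)] +
  rnorm(nrow(my_data), sd = 0.5)

gam_fit <- gam(y ~ s(age) + s(region, bs = "re"),
               data = my_data, method = "REML")

# Predicted random intercepts, one per level of region
coefs  <- coef(gam_fit)
re_est <- coefs[grep("s(region)", names(coefs), fixed = TRUE)]

# Estimates against standard normal quantiles, plus a reference line
qq <- qqnorm(re_est, xlab = "Gaussian quantiles", ylab = "effects",
             main = "s(region)")
qqline(re_est)

# For comparison: the panel mgcv draws for the second smooth term
plot(gam_fit, select = 2)
```

With only a handful of levels, the points will scatter around the line even when the normality assumption holds, so the plot is most informative when the grouping factor has many levels.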