0

Following this post about plotting GAM's smooth components in the same panel, I would like to further edit the plot to make the y-axis easier to interpret (after looking at this workshop, specifically in "Transforming Standard Errors (2)", where it is mentioned that we can basically add the intercept to make the y-axis more interpretable).

However, two questions were raised in the comments section, so (1) whether I wanted to only shift the values up (which would be shift() component of the plot() function), or (2) if I wanted to include the uncertainty in the estimate of the constant term.

Question:

I did try to apply the two suggestions, using the example provided in one of the comments, but it did not work so well. I followed some help in the comments but it didn't completely work out:

Example (data used here):

library("gratia")    
set.seed(3)
dat <- gamSim(1,n=1500,dist="normal",scale=20)
dat$fac <- as.factor( sample(c("A1", "A2", "A3"), nrow(dat), replace = TRUE) )
dat$logi <- as.logical( sample(c(TRUE, FALSE), nrow(dat), replace = TRUE) )
bs <- "cr"; k <- 12
b <- gam(y ~ x0 + x1 + I(x1^2) + s(x2,bs=bs,k=k) + fac + x3:fac + I(x1*x2) + logi,
         data=dat)

Plot shifting the values (1)

Here, I'm guessing I need to extract the values from the model - how is this done?

Then, I would need to add the intercept coef(b)[1L] to the estimate, lower_ci and upper_ci. If I want to plot two models in the same plot (as here), I would need to repeat this procedure for each model, as the intercept will be different. Is this correct?

Plot including uncertainty (2)

This doesn't seem to work so well with a smooth factor

 ds <- data_slice(b, x2 = evenly(x2)) |> # assuming I want to plot x2
          dplyr::mutate(logi = as.logical(logi))
    
 ds
# A tibble: 100 × 6
        x2    x0    x1 fac   logi     x3
     <dbl> <dbl> <dbl> <fct> <lgl> <dbl>
 1 0.00173 0.511 0.521 A1    FALSE 0.476
 2 0.0118  0.511 0.521 A1    FALSE 0.476
 3 0.0219  0.511 0.521 A1    FALSE 0.476
 4 0.0320  0.511 0.521 A1    FALSE 0.476
 5 0.0421  0.511 0.521 A1    FALSE 0.476
 6 0.0521  0.511 0.521 A1    FALSE 0.476
 7 0.0622  0.511 0.521 A1    FALSE 0.476
 8 0.0723  0.511 0.521 A1    FALSE 0.476
 9 0.0824  0.511 0.521 A1    FALSE 0.476
10 0.0925  0.511 0.521 A1    FALSE 0.476
    
    
fv <- fitted_values(b, data = ds, terms = c("(Intercept)", "x2"))
    
fv
# A tibble: 100 × 10
        x2    x0    x1 fac   logi     x3 fitted    se lower upper
     <dbl> <dbl> <dbl> <fct> <lgl> <dbl>  <dbl> <dbl> <dbl> <dbl>
 1 0.00173 0.511 0.521 A1    FALSE 0.476   5.75  2.55 0.753  10.8
 2 0.0118  0.511 0.521 A1    FALSE 0.476   5.75  2.55 0.753  10.8
 3 0.0219  0.511 0.521 A1    FALSE 0.476   5.75  2.55 0.753  10.8
 4 0.0320  0.511 0.521 A1    FALSE 0.476   5.75  2.55 0.753  10.8
 5 0.0421  0.511 0.521 A1    FALSE 0.476   5.75  2.55 0.753  10.8
 6 0.0521  0.511 0.521 A1    FALSE 0.476   5.75  2.55 0.753  10.8
 7 0.0622  0.511 0.521 A1    FALSE 0.476   5.75  2.55 0.753  10.8
 8 0.0723  0.511 0.521 A1    FALSE 0.476   5.75  2.55 0.753  10.8
 9 0.0824  0.511 0.521 A1    FALSE 0.476   5.75  2.55 0.753  10.8
10 0.0925  0.511 0.521 A1    FALSE 0.476   5.75  2.55 0.753  10.8

Now this provides no variation in the fitted/CI values - What am I doing wrong here?

Thanks in advance for any help!

mto23
  • 303
  • 1
  • 5
  • 15
  • Why are you converting the factor `fac` to a logical? That's the source of the problem as that turns your factor values into logical `NA` values, and if those values are `NA` then the predictions must, by definition be `NA` because even though you are not including the effect of `fac` in the predicted values, `predict.gam` will proceed to generated fitted values only on the set of complete data supplied to it. – Gavin Simpson May 11 '23 at 13:21
  • Thank you, you're correct, that was a mistake! I edited the question, as I came up with other issues... – mto23 May 11 '23 at 15:08
  • You need to spell out the names of the *terms* in full when using `terms` or `exclude`. Your model doesn't contain any term with the label `"x2"`. You have a term `s(x2, bs = bs, k = k)` which mgcv labels `"s(x2)"` (see `summary(b)` or `gratia::smooths(b)` for the labels of smooths). So you need `fv <- fitted_values(b, data = ds, terms = c("(Intercept)", "s(x2)"))`. With your code currently, `x2` is varying in the data slice, but you didn't include this smooth term in the predictions, instead only including the intercept term, which is why your predictions are constant. – Gavin Simpson May 11 '23 at 17:05

0 Answers0