3

I'm trying to run a GAM model where the interaction between X and Y is modelled using a Gaussian process. The code below works fine when using the default smooth (s()) in mgcv, but I would like to model my data with a tensor product (te()) as I understand te products specifically address anisotropic interactions. However, I cannot seem to pass the required parameters to the model when I use te(). Example below.

library(mgcv)
set.seed(540)
df <- gamSim(2, n = 300, scale = 0.15)[[1]]
df$x <- scales::rescale(df$x, to = c(-180,180))
df$y <- scales::rescale(df$y, to = c(-90,90))
head(df)
# works fine using s()
m1 <- gam(z ~ s(x, y, bs = "gp", m = c(2, 5, 2), k = 30), data = df)
summary(m1)
Family: gaussian 
Link function: identity 

Formula:
z ~ s(x, y, bs = "gp", m = c(2, 5, 2), k = 30)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.46143    0.01581   29.19   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
         edf Ref.df     F p-value
s(x,y) 8.055  12.73 0.333   0.983

R-sq.(adj) =  0.0248   Deviance explained = 5.11%
GCV = 0.077292  Scale est. = 0.074959  n = 300

# Fails
# pass list as per example in ?te for multivariate marginal
mp <- list(c(2, 5, 2))
m2 <- gam(z ~ te(x, y, bs = "gp", m = mp, k = 30), data = df)
Error in gpE(knt, knt, object$p.order) : 
  incorrect arguments to GP smoother
In addition: Warning message:
In mgcv::te(x, y, bs = "gp", m = mp, k = 30) : m wrong length and ignored.

# try passing directly
m2 <- gam(z ~ te(x, y, bs = "gp", m = c(2,5,2), k = 30), data = df) # FAILS
Error in gpE(knt, knt, object$p.order) : 
  incorrect arguments to GP smoother
In addition: Warning message:
In mgcv::te(x, y, bs = "gp", m = c(2, 5, 2), k = 30) :
  m wrong length and ignored.

# don't pass m
m2 <- gam(z ~ te(x, y, bs = "gp", k = 10), data = df) # works
summary(m2)
Family: gaussian 
Link function: identity 

Formula:
z ~ te(x, y, bs = "gp", k = 10)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.46143    0.01299   35.52   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
          edf Ref.df     F p-value    
te(x,y) 14.69  20.14 7.633  <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.341   Deviance explained = 37.4%
GCV = 0.05341  Scale est. = 0.050618  n = 300

gam.check(m2) # k has been ignored!?
Method: GCV   Optimizer: magic
Smoothing parameter selection converged after 15 iterations.
The RMS GCV score gradient at convergence was 6.499192e-07 .
The Hessian was positive definite.
Model rank =  100 / 100 

Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.

          k'  edf k-index p-value
te(x,y) 99.0 14.7    1.08    0.94
KaanKaant
  • 434
  • 3
  • 16

2 Answers2

0

I solved this after realising that d has to be explicitly set inside the te() function. Code below works.

library(mgcv)
set.seed(540)
df <- gamSim(2, n = 300, scale = 0.15)[[1]]
df$x <- scales::rescale(df$x, to = c(-180,180))
df$y <- scales::rescale(df$y, to = c(-90,90))
head(df)

m1 <- gam(z ~ s(x, y, bs = "gp", m = c(2, 5, 2), k = 30), data = df, method = "ML")

mp <- list(c(2, 5, 2))
m2 <- gam(z ~ te(x, y, bs = "gp", m = mp, d = 2, k = 30), data = df, method = "ML")
KaanKaant
  • 434
  • 3
  • 16
  • I'm not sure this is doing what you think it is. I would expect these to be the same because the `te` smooth you specified is requesting a 2d gp smooth as the *only* marginal basis. I would check the number of smoothing parameters estimated in the two models `m1$sp` and `m2$sp` to see if they are different. For anisotropy in `te` smooths I would expect that you'd need two separate marginal GP bases, `te(x, y, bs = "gp", m = list(x = mp, y = mp), k = 30)` and you probably want to change the length scales in `mp` for x and y if you want anisotropy. – Gavin Simpson Dec 21 '18 at 17:12
0

What you want does not seem to exist (yet?), i.e., Gaussian process (GP) smooths are isotropic in nature. You can see that they are isotropic from the possible auto-correlation functions available for GP smooths in package MGCV (see page 242 from Wood 2017 or simply type "?gp.smooth" in R after loading the library): all of them depend only on the distance d between data points. If there would be anything like anisotropic GP smooths, you would have the possibility of defining different values for the auto-correlation scale for each axis, which does not seem to be the case.

If your x and y data have different units and you really need a GP smooth the only thing you can do, as far as I can tell, is to scale them into unit ("as is often done in LOESS smoothing", see page 227, Wood 2017).

Wood 2017: "Generalized Additive Models. An Introdcution with R. Second edition".

nukimov
  • 216
  • 1
  • 7