I am using gam (from mgcv package in R) to model presence/absence data in 3355 cells of 1x1km (151 presences and 3204 absences). Even though I include a smooth with the spatial locations in the model to address the spatial dependence in my data, the results from a variogram show spatial autocorrelation in the residuals of my gam (range=6000 meters). Since I am modelling a binary response, using a gamm with a correlation structure is not advisable because it "performs poorly with binary data", neither gamm4 because (although is supposed to be appropriate for binary data) it has "no facility for nlme style correlation structures".
The alternative I have found is to fit my model using the function magic from the same mgcv package. Because I found no examples of how to use magic for spatially correlated data I have adapted the ?magic example for temporally correlated data. The results of the output (see below) change the coefficients of the model but does not remove the spatial autocorrelation and the smooth plots show the same effect.
> summary(data)
Object of class SpatialPolygonsDataFrame
Coordinates:
min max
x 670000 780000
y 140000 234000
Is projected: TRUE
proj4string :
[+proj=tmerc +lat_0=0 +lon_0=19 +k=0.9993 +x_0=500000 +y_0=-5300000 +datum=WGS84 +units=m
+no_defs +ellps=WGS84 +towgs84=0,0,0]
Data attributes:
f_edge lat long dam
Min. : 0.0 Min. :670500 Min. :140500 0:3204
1st Qu.: 0.0 1st Qu.:722500 1st Qu.:165500 1: 151
Median : 8.0 Median :743500 Median :181500
Mean :11.5 Mean :736992 Mean :180949
3rd Qu.:21.0 3rd Qu.:756500 3rd Qu.:194500
Max. :50.0 Max. :779500 Max. :233500
> gam1x1 <- gam(dam ~ s(lat, long) + s(f_edge),
+ family = binomial, data=data,
+ method = "ML", select = TRUE, na.action = "na.fail")
> # generate standadized residuals
> res <- residuals(gam1x1, type = "person")
> # calculate the sample variogram and fit the variogram model
> var <- variogram(res ~ long + lat, data=data)
> fit <- fit.variogram(var, vgm(c("Sph", "Exp"))); plot(var, fit)
> # extract the nugget and range from the variogram model to create the correlation structure
> # to include in the magic formula
> fit
model psill range
1 Nug 0.5324180 0.000
2 Sph 0.2935823 6002.253
> cs1Exp <- corSpher(c(6002.253, 0.5324180), form = ~ long + lat, nugget=TRUE)
> cs1Exp <- Initialize(cs1Exp, dataPOD1x1s)
> # follow instructions from mgcv 'magic' function
> V <- corMatrix(cs1Exp)
> Cv <- chol(V)
> ## gam ignoring correlation
> b <- gam(dam ~ s(lat, long) +
+ s(f_edge),
+ family = binomial,
+ data=data,
+ method = "ML",
+ select = TRUE,
+ na.action = "na.fail")
> ## Fit smooth, taking account of *known* correlation...
> w <- solve(t(Cv)) # V^{-1} = w'w
> ## Use `gam' to set up model for fitting...
> G <- gam(dam10_15 ~ s(lat, long) +
+ s(f_edge),
+ family = binomial,
+ data=dataPOD1x1s,
+ method = "ML",
+ select = TRUE,
+ na.action = "na.fail",
+ fit=FALSE)
> ## fit using magic, with weight *matrix*
> mgfit <- magic(G$y,G$X,G$sp,G$S,G$off,rank=G$rank,C=G$C,w=w)
> mg.stuff <- magic.post.proc(G$X,mgfit,w)
> b$edf <- mg.stuff$edf;b$Vp <- mg.stuff$Vb
> b$coefficients <- mgfit$b
> summary(gam1x1)
Family: binomial
Link function: logit
Formula:
dam ~ s(lat, long) + s(f_edge)
Parametric coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.6634 0.1338 -27.37 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Approximate significance of smooth terms:
edf Ref.df Chi.sq p-value
s(lat,long) 15.208 29 83.35 2.70e-14 ***
s(f_edge) 3.176 9 54.05 2.85e-13 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-sq.(adj) = 0.0595 Deviance explained = 15.6%
-ML = 551.92 Scale est. = 1 n = 3355
> summary(b)
Family: binomial
Link function: logit
Formula:
dam ~ s(lat, long) + s(f_edge)
Parametric coefficients:
Estimate Std. Error z value Pr(>|z|)
[1,] 0.06537 0.01131 5.778 7.56e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Approximate significance of smooth terms:
edf Ref.df Chi.sq p-value
s(lat,long) 3.716 29 0.046 1
s(f_edge) 6.753 9 0.295 1
R-sq.(adj) = 0.0618 Deviance explained = 15.6%
-ML = 551.92 Scale est. = 1 n = 3355
plot(b, select=2)
plot(gam1x1, select =2)
> resMag <- residuals.gam(b, type = "person")
> varMag <- variogram(resMag ~ long + lat, data=dataPOD1x1s)
> fitMag <- fit.variogram(varMag, vgm(c("Sph", "Exp"))); plot(varMag, fitMag)
> fitMag
model psill range
1 Nug 0.5324180 0.000
2 Sph 0.2935823 6002.253
Is something wrong in my script? Is there any other alternative to remove the residuals' spatial autocorrelation from my binomial gam?
Thanks a lot!