2

I would like to perform regression kriging (RK) for binary presence-absence and host grid data as a constant predictor. I have used logistic function to estimate relationship between binary outcome and predictor, however I think it is not passing the RK assumptions? The predictor variable does not come out as a significant in the model. Is there any alternative how to approach it?

Data for the code: https://drive.google.com/folderview?id=0B7-8DA0HVZqDYk1BcFFwSkZCcjQ&usp=sharing

presabs <- read.csv("Pres_Abs.csv",header=T, 
                             colClasses = c("integer","numeric","numeric", 
                                            "integer"))

coordinates(presabs) <- c("Long","Lat")   # creates SpatialPointsDataFrame

host <- read.asciigrid("host.asc.txt")  # reads ArcInfo Ascii raster map 

host.ov <- overlay(host, presabs)  # create grid-points overlay
presabs$host.asc.txt <- host.ov$host.asc.txt  #copy host values
presabs$host.asc.txt <- log(host.ov$host.asc.txt)

glm(formula = Pres ~ host.asc.txt, family = binomial, data = presabs)
summary(glm.presabs)

Weighted Residuals:
    Min      1Q  Median      3Q     Max 
-0.3786 -0.3762 -0.3708 -0.3497  3.3137 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  -1.942428   0.320104  -6.068 1.38e-08 ***
host.asc.txt -0.001453   0.003034  -0.479    0.633    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.007 on 127 degrees of freedom
Multiple R-squared:  1.317e-05, Adjusted R-squared:  -0.007861 
F-statistic: 0.001673 on 1 and 127 DF,  p-value: 0.9674

Then, when it comes to the actual kriging, I have build this code from a tutorial, but it seems the actual residuals from the glm are not fed into krige function. Can it be improved in gstat?

library(gstat)

# Set bin width for the variogram and max distance:
Bin <- 0.09
MaxDist <- 1
BinNo <- MaxDist/Bin

# Calculate and plot the variogram
surpts.var <- variogram(Pres~1, presabs, cutoff=MaxDist, width = Bin)
plot(surpts.var)

# Insert parameter values for the variogram model
psill = 0.05921
distance = 63.7/111
nugget = 0.06233 # constant

# Fit and plot variogram model:
null.vgm <- vgm(psill,"Sph",distance,nugget) # initial parameters
vgm_Pres_r <- fit.variogram(surpts.var, model=null.vgm, fit.ranges=TRUE, 
                            fit.method=1)
plot(surpts.var,vgm_Pres_r)

# Run RK using universal kriging:
presabs_uk <- krige(Pres~host.asc.txt, locations=presabs, 
                         newdata=host, model=vgm_Pres_r)
mihoo
  • 157
  • 3
  • 12
  • There are several problems with the first script: first, you forgot `library(sp)`, then the point data do not come in a convenient format on your google drive, then `presabs$host.asc.txt <- log(host.ov$host.asc.txt)` does not correspond to the glm output down below, then you seem to call `summary.lm` on a `glm` object instead of the appropriate `summary`, then `glm.presabs` is missing. Looks like you copy and pasted commands without checking they're consistent and reproducible. – Edzer Pebesma Apr 27 '15 at 13:41

1 Answers1

1

krige mentions that it is

[using universal kriging]

This means that it fits a linear model, but not a generalized linear model. It uses the variogram that you fitted to the raw data, not to the residuals. The residual variogram would have been obtained by

surpts.var <- variogram(Pres~host.asc.txt, presabs, cutoff=MaxDist, width = Bin)

but is nearly identical, since your variable and the grid map are nearly uncorrelated:

> cor(presabs$Pres,presabs$host.asc.txt)
[1] -0.04281038

so, it is not a suprise that you don't recognize the grid map in the universal kriging predictions: the two are nearly (linearly) independent.

Edzer Pebesma
  • 3,814
  • 16
  • 26