0

I have obtained an mppm object by fitting a model on several independent datasets using the mppm function from the R package spatstat. I then want to compare the K function for the fitted model with the K function for the observations. However, it seems that the plots are not based on the pooled data, but on one single data set (as changing the order in which the data are read changes the plots).

I fitted my model as such:

data <- listof(NMJ1,NMJ2,NMJ3,NMJ4,NMJ5,NMJ6,NMJ7,NMJ8,NMJ9,NMJ10,NMJ11,NMJ12,NMJ13)
data <- hyperframe(X=1:13, Points=data)
model  <- mppm(Points ~marks*sqrt(x^2+y^2), data)

where NMJ1 to NMJ13 are marked ppp (with types GluR and BRP) and are independent realizations of the same experiment. I then used the following code to obtain the envelopes and K functions:

gamma= 1 - 0.95^(1/13)
nsims=round(1/gamma-1)
sims <- simulate(model, nsim=2*nsims)
SIMS <- list()
for(i in 1:nrow(sims)) SIMS[[i]] <- as.solist(sims[i,,drop=TRUE])
Hplus <- cbind(data, hyperframe(Sims=SIMS))

EE1 <- with(Hplus, envelope(Points, Kcross.inhom,funargs=list("GluR","GluR"), nsim=nsims, simulate=Sims,savefuns=TRUE))
EE2 <- with(Hplus, envelope(Points, Kcross.inhom,funargs=list("BRP","GluR"), nsim=nsims, simulate=Sims,savefuns=TRUE))
EE3 <- with(Hplus, envelope(Points, Kcross.inhom,funargs=list("BRP","BRP"), nsim=nsims, simulate=Sims,savefuns=TRUE))

plot(pool(EE1), main="Unc13A: GluR to GluR interactions", lwd = 5,xlim = c(0,15),legend=FALSE,cex.axis=1.5,cex.lab=1.5,ylab="K(r)", xaxt = "n",xlab='r [nm]')
axis(1, at=c(0,5,10,15), labels=c(0,5,10,15)*20,cex.axis=1.5)
r = 0:20
lines( r, pi*r**2, type="c", lwd = 5, col = "blue")

plot(pool(EE2), main="Unc13A: BRP to GluR interactions", lwd = 5,xlim = c(0,15),ylim = c(0,1000),legend=FALSE,cex.axis=1.5,cex.lab=1.5,ylab="K(r)", xaxt = "n",xlab='r [nm]')
axis(1, at=c(0,5,10,15), labels=c(0,5,10,15)*20,cex.axis=1.5)
lines( r, pi*r**2, type="c", lwd = 5, col = "blue")

plot(pool(EE3), main="Unc13A: BRP to BRP interactions", lwd = 5,xlim = c(0,15),legend=FALSE,cex.axis=1.5,cex.lab=1.5,ylab="K(r)", xaxt = "n",xlab='r [nm]')
axis(1, at=c(0,5,10,15), labels=c(0,5,10,15)*20,cex.axis=1.5)
lines( r, pi*r**2, type="c", lwd = 5, col = "blue")
legend(0, 580, c('Observed','Fitted','Poisson'), lwd = c(5,5,5), col = c('black','red','blue'), cex=1.5)

The result is the following plot: enter image description here

However, I observed that I sometimes had a significant offset between the black line (observations) and the red dashed one (fitted model), while I thought my mppm object precisely ought to fit the data in the best possible way. I realized that changing the order in which the data are being read (for instance, substituting NMJ1 for NMJ2 in the first line:

data <- listof(NMJ2,NMJ1,NMJ3,NMJ4,NMJ5,NMJ6,NMJ7,NMJ8,NMJ9,NMJ10,NMJ11,NMJ12,NMJ13)

modifies the red and black lines: it thus may be possible that they represent one single data set, and not the pooled data. Did I miss an argument in the plot function ?

1 Answers1

1

Please learn how to make a minimal reproducible example which will save everyone's time.

Like some of your previous questions, this issue is about understanding the pool command, rather than the plot command. When envelopes are pooled by the method pool.envelope, it is assumed that it would be valid to combine them. That means that each of the envelopes should have been computed (a) for exactly the same data, (b) by generating simulations from exactly the same null model. In your example, (a) is not true because each row of the hyperframe contains a different point pattern dataset; (b) is true for the particular model in your example, but might not be true for other models, depending on the arguments in the call to mppm. Since (a) does not hold, the results are not as expected.

In general, one cannot speak about "the K function" of a model fitted by mppm, because the model will be a different point process on each row of the hyperframe. However in your case, the model is the same point process on each row of the hyperframe, so the model has a well-defined K function, and you could estimate this K-function from the original data by pooling the individual K functions. So a valid way to get what you want is to first pool and then make envelopes. I will figure out how to do that for you, and post another message.

Judging by the first few lines of your code, I'm guessing that you would like to use a Bonferroni argument to construct a single diagnostic plot to serve as a goodness-of-fit test for the entire model. You could do that more easily by simply plotting the array of envelopes EE1 and rejecting the null hypothesis at the 5 percent level if any of the individual envelopes are deemed significant.

It's better to use global=TRUE and Lcross.inhom for testing purposes.

Adrian Baddeley
  • 2,534
  • 1
  • 5
  • 8