1

I want to use replicated spatial point patterns to perform hypothesis tests in spatstat. spatstat has wonderful documentation, and you can find details about replicated point pattern analysis here: https://cran.r-project.org/web/packages/spatstat/vignettes/replicated.pdf

Several mapped forest plots will represent a point process working at different locations. Each point pattern is marked, where each mark represents a tree species. Additionally, each point pattern is associated with a covariate raster image. First, I want to create a null model which assumes that each mark has a different relationship with the covariate. Then I want to use this null model to test whether certain species are associated with (or avoid) one another by simulating the random labeling hypothesis and plotting the resulting simulation envelopes. (The random labeling hypothesis states that the marks are randomly assigned to points in the point pattern.)

First, I'll show how I would normally do this analysis using a single point pattern. Then I'll explain the two problems I'm having using hyperframes to perform the same analysis.

Let's say you have a marked point pattern, where each mark is the tree species. I'll use the "lansing" data set available in spatstat:

library(spatstat)
data(lansing)
par(mar=rep(0.5,4))
plot(split(lansing),main="")

Now let's say you want to look at some spatial covariate (e.g., soil nutrients or moisture), so you create a raster image of a kernel smoothed density of the measurement:

sim1 <- rpoispp(function(x,y) {500 * exp(-3*x)}, win=owin(c(0,1),c(0,1)))
sim1 <- density(sim1)

First create null model:

single.mod <- ppm(lansing ~ marks*sim1)

Where the "ppm" function recognizes "marks" as the column with the species names, and "sim1" as the covariate.

Then perform a simulation-based hypothesis test, where you're interested in whether Black Oak and Maple are found in the same locations.

single.E<-envelope.ppm(single.mod,Lcross,i="blackoak",j="maple",nsim=39, nrank=1,global=TRUE,correction="best",simulate=expression(rlabel(lansing)))
par(mar=rep(4,4))
plot(E,legend=FALSE,ylab="L-function",xlab="Spatial scale (m)",main="Testing random label hypothesis \nfor single point pattern")

This works fine. Now if we go out and sample a couple more plots to make our analysis more robust, we can incorporate additional plots into a hyperframe, where each plot has its own point pattern and are considered "experimental" replicates. Each plot will get its own spatial covariate as well:

sim2 <- rpoispp(function(x,y) {500 * exp(-2*y)}, win=owin(c(0,1),c(0,1)))
sim2 <- density(sim2)

sim3 <- rpoispp(500, win=owin(c(0,1),c(0,1)))
sim3 <- density(sim3)

hyper <- hyperframe(pp=list(lansing,lansing,lansing),sims=list(sim1,sim2,sim3))

Sim2 and sim3 are spatial covariates for the additional two plots we've collected, and the "hyperframe" function combines the three point patterns with their associated spatial covariates into one hyperframe.

I'd like to build a model using "mppm" (used for creating models for multiple point patterns), where each point pattern is explained by their spatial covariates, "sims":

hyper.mod <- mppm(pp ~ sims, data = hyper)

The first problem arises when I try to allow each mark to carry a different relationship with the covariate:

int.mod <- mppm(pp ~ marks*sims, data=hyper)

The following error message spits out:

Error in checkvars(formula, data.sumry$col.names, extra = c("x", "y", : Variable "marks" in formula is not one of the names in data"

I get the same error using:

int.mod <- mppm(pp ~ pp$marks*sims, data=hyper)

The second problem is figuring out how to run the simulation-based hypothesis test on the hyperframe. Let's use the hyperframe model that worked (hyper.mod) to try this:

E.hyper <- envelope(hyper.mod,Lcross,i="blackoak",j="maple",nsim=39, nrank=1,global=TRUE,correction="best",simulate=expression(rlabel(pp)))

You get an error message:

Error in UseMethod("envelope") : no applicable method for 'envelope' applied to an object of class "c('mppm', 'list')"

Implying that "envelope" doesn't work on mppm objects (only ppp or ppm). I suspect there's a way to get around this limitation cleverly, but I haven't found it yet. Any suggestions or guidance would be very helpful!

Zheyuan Li
  • 71,365
  • 17
  • 180
  • 248
A. Brown
  • 31
  • 4

2 Answers2

2

There is no envelope.mppm in spatstat because some of the statistical issues (related to multiple hypothesis testing) have yet not been resolved.

The fastest solution is probably to use cdf.test.mppm which will perform a test and give some graphical output.

It would only be reasonable to pool the envelopes (of the K functions, say) from different point patterns to obtain a single envelope provided the fitted model implies that the different patterns should be statistically equivalent. That would not be valid if, e.g., the model includes covariates which are different for different patterns.

A better strategy is probably to plot global envelopes for each pattern, and use the product rule for multiple testing. Suppose there are M point patterns in the data, and you want a test (of the fitted model) of significance level alpha (usually alpha = 0.05). Then you're going to construct M envelopes, one for each pattern, each with significance level gamma = 1 - (1-alpha)^(1/M). Each envelope will be generated by envelope.ppp with global=TRUE and nsim = 1/gamma - 1 approximately. Example: if M = 10 and alpha = 0.05 then gamma=1 - 0.95^(1/10) = 0.0051 so nsim=1/gamma-1 = 194.45, call it nsim=195. Since you're going to make global envelopes you may need twice that number of simulations, as explained the help for envelope. So do as follows, where fit is the fitted model and H is the original hyperframe of data:

sims <- simulate(fit, nsim=2*195)
SIMS <- list()
for(i in 1:nrow(sims)) SIMS[[i]] <- as.solist(sims[i,,drop=TRUE])
Hplus <- cbind(H, hyperframe(Sims=SIMS))

This augments the original hyperframe by adding a column of simulated patterns (each entry in the column is a 'solist' containing 2*195 patterns). Then do (where X is the column of H containing the original point pattern datasets)

EE <- with(Hplus, envelope(X, Lest, global=TRUE, nsim=195, simulate=Sims))
plot(EE)

This produces a plot with many panels of envelopes. Their interpretation is that, if any one of the observed L functions wanders outside the corresponding envelopes, the result is significant - the test rejects the null hypothesis that the fitted model is true.

Adrian Baddeley
  • 1,956
  • 1
  • 8
  • 7
  • Your code works fabulously, @Adrian Baddeley, thank you! If I wanted to test the random label hypothesis across the hyperframe, I'm assuming I could take a similar approach: `EE <- with(H, envelope.ppp(spp, Lest, global=TRUE, nsim=195, simulate=expression(rlabel(spp)))` Where I use the raw point pattern (spp) to simulate random marks on the pattern, and the correct number of simulations to account for multiple testing? – A. Brown Jul 07 '16 at 21:47
1

The first error is a bug that was fixed a few days ago in the development version of spatstat. If you have the devtools package you can get the latest (and greatest) spatstat easily:

devtools::install_github('spatstat/spatstat')

Let me know if this is not an option for you (and also which platform you are on).

The second error is indeed because envelope isn't implemented for the class mppm, so you have to devise a workaround for now as you say.

I think there are a couple of issues with what you have done so far: The model is inhomogeneous, but you use Lcross rather than Lcross.inhom and I think your single.E is equivalent to (where you don't use the fitted model at all):

single.E <- envelope(lansing, Lcross, i="blackoak", j="maple", nsim=39, rank=1, global=TRUE, correction="best", simulate=expression(rlabel(lansing)))

Let me know how you progress with this. I might find time to give some more details on a workaround for the missing envelope.mppm (pooling summary functions for each pattern).

Ege Rubak
  • 4,347
  • 1
  • 10
  • 18
  • Thank you, Dr. Rubak! Installing the new version of spatstat fixed the first issue. I'll continue to brainstorm about the second issue; although my newness to programming means it may be a long shot. – A. Brown Jul 02 '16 at 23:36
  • A very simple solution to problem #2 (performing hypothesis test via simulation envelopes on multiple point patterns) seems to be to take the mean (or weighted mean, weighted by sample size) of obs, mmean, hi, and lo (each of which is an output from the function "envelope"), and create a graph manually using those numbers. Is this an over-simplified solution? – A. Brown Jul 03 '16 at 02:10