the meaning of cluster size in Cox process models in spatstat

Question

In some tree wood, the conduits in cross sections clearly aggregate into clusters. It seems natural to fit a Cox process model in spatstat (R) to the conduit point data, and the results include an estimated "Mean cluster size". I am not sure what this index means: can I take it to be the mean number of conduits per cluster over the whole conduit point data set? Code from a good example in the book follows:

    library(spatstat)
    fitM <- kppm(redwood ~ 1, "MatClust")
    fitM
    #...
    # Scale-0.08654
    # Mean cluster size: 2.525 points

In their book, the authors of spatstat explain the mean cluster size as the number of offspring points, which are dispersed around parent points like plant seedlings. In my case no such process occurs: conduits are xylem cells that develop from cambium cells at the outside of the stem's annual ring, and they do not disperse randomly. I would like to estimate the mean cluster size and cluster scale for my conduit distribution data, and the Scale and Mean cluster size seem to be what I want. However, the redwood data differ from mine in nature, so I am not sure what these quantities mean for my data. Furthermore, I am wondering which model suits my context: Neyman-Scott, MatClust, Thomas or others? Any suggestion is appreciated. Jingming

score 1 · Answer 1 · answered Oct 09 '19 at 13:00

If you fit a parametric point process model such as a Thomas or Matérn cluster process, you are assuming the data are generated by a random process that produces a random number of clusters with a random number of points in each cluster. The location of the points around each cluster center is also random. The parameter kappa controls the expected number of clusters (per unit area), mu controls the expected number of points in a cluster, and scale controls the spatial extent of the cluster. The type of process (Thomas, Matérn or others) determines the distribution of points within the cluster. My best suggestion is to do simulation experiments to understand these different types of processes and see if they are appropriate for your needs.
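As a side note on how the fitted values relate to each other, here is a minimal sketch using the redwood fit from the question. It assumes that `parameters()` returns a list with components `kappa`, `scale` and `mu` for a fitted `kppm` model, and that for a stationary model (`~ 1`) the overall intensity equals kappa times mu; the names `fit`, `est` and `lambda_hat` are only illustrative.

```r
library(spatstat)

# Fit the Matern cluster model from the question to the built-in redwood data.
fit <- kppm(redwood ~ 1, "MatClust")

# parameters() collects the fitted values in a named list
# (assumed component names: kappa, scale, mu).
est <- parameters(fit)
est$kappa   # estimated intensity of cluster centres (per unit area)
est$mu      # estimated mean number of points per cluster
est$scale   # estimated cluster scale (radius for a Matern cluster process)

# For a stationary model the overall intensity is kappa * mu, so the
# "Mean cluster size" should equal (points per unit area) / kappa.
lambda_hat <- npoints(redwood) / area(Window(redwood))
lambda_hat / est$kappa
```

Read this way, the "Mean cluster size" is not obtained by counting points in visible clusters. It is the value of mu implied by the fitted kappa and the observed intensity under the assumed cluster model.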

For example, on average 10 clusters in the unit square, with on average 5 points in each and a short spatial extent of the cluster (scale = 0.01), gives you fairly well-defined tight clusters:

    library(spatstat)
    set.seed(42)  # make the simulations reproducible
    # nine realisations of a Thomas process in the unit square
    sim1 <- rThomas(kappa = 10, mu = 5, scale = 0.01, nsim = 9)
    plot(sim1, main = "")

In contrast, on average 10 clusters in the unit square, with on average 5 points in each and a bigger spatial extent of the cluster (scale = 0.05), gives a less clear picture where it is hard to see the clusters:

    sim2 <- rThomas(kappa = 10, mu = 5, scale = 0.05, nsim = 9)
    plot(sim2, main = "")

In conclusion: experiment with simulation, and remember to do many simulations of each experiment rather than just one, which can be very misleading.
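One way to make "many simulations" concrete is to refit the model to each simulated pattern and look at how much the estimates vary. This is only a sketch under stated assumptions: the parameter values and the number of simulations are chosen for illustration, `parameters()` is assumed to return a `mu` component for a fitted `kppm` model, and the names `sims` and `mu_hat` are hypothetical.

```r
library(spatstat)
set.seed(42)

# Simulate 20 Thomas patterns with known parameters (illustrative values).
sims <- rThomas(kappa = 10, mu = 5, scale = 0.05, nsim = 20)

# Refit a Thomas model to each pattern and keep the estimated mean cluster size.
mu_hat <- sapply(sims, function(pat) parameters(kppm(pat ~ 1, "Thomas"))$mu)

# Spread of the estimates around the true value mu = 5.
summary(mu_hat)
```

A wide spread in `mu_hat` would suggest that a single fitted value from a pattern with this few points should be interpreted with caution.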

answered Oct 09 '19 at 13:00

Ege Rubak


Thanks Ege, glad to have your comments. If I cannot meet the first assumption, "Poisson parents", I should not use the Cox process model. Generally, in a growth ring the early-wood conduits are bigger than the late-wood conduits (spring cells grow bigger, maybe only a little). The previous paper also used a heterogeneous model to fit the data, perhaps by the same reasoning. I am not sure whether cambium cells occur randomly, but the xylem cells (conduits) do present a heterogeneous spatial pattern. Just thinking: maybe I should try some data simulation as you suggested. – Jingming Oct 11 '19 at 13:41
Another question: what is the lowest number of points for a ppp object? I know that more than 100 points in a data set is better; however, some species just cannot meet this condition, i.e., there may be only 30 or fewer conduits in a growth ring of a tree species. Do you think such data can be fitted well? – Jingming Oct 11 '19 at 13:48
To me, the fitted parameters mu and scale for each species could be compared, so that I could know which species has more closely contacting conduits (two neighbouring conduits may transport water more efficiently than two single isolated conduits, which is the biological significance of this study). – Jingming Oct 11 '19 at 14:03
On second thought, I am pursuing fitted model parameters, which are not defined by myself as in your simulations. How do I find a reliable model fit? Can simulation help with this? Thanks. – Jingming Oct 11 '19 at 14:07
My point with the simulation is that it can help you understand how these models work in principle and how to interpret the parameters. To fit such models to data you have to use `kppm`. I cannot help you find an appropriate model and solve your actual scientific problem. I hope you can benefit from studying the spatstat book. – Ege Rubak Oct 11 '19 at 20:20
Thanks for the quick response. I figured it out finally. – Jingming Oct 12 '19 at 15:50
