I am fairly new to R and programming in general, so I appreciate your input.

I'm trying to obtain the nearest neighbor distances for a set of points. My data consists of the scores on the first 4 Principal Components (PCs) of a PCA on morphological traits for different species, with several individuals of each species, for a total of 276 rows.

So, I created my ppx by combining one column for each of the PCs (PC1, PC2, PC3, PC4) and one column for my marks (the scientific name of each species). I extracted the coordinates from the original PCA object, by columns.

Multidimensional point pattern 
276 points 
4-dimensional space coordinates (PC1,PC2,PC3,PC4)
1 column of marks: ‘Species_marks’

I first tried obtaining the nndist using only the first and second PCs, but I now realize I want the distance in all 4 dimensions, broken down by species. What I want is, for each individual, the distance to its nearest neighbor within each species.

Am I correct in interpreting the output of this line as what I am describing above?

PPX_nndistance <- nndist.ppx(My_PCA_Points, by=Species_marks)

The output is a matrix with the same number of points/rows (276) as my original data, and one column for each of the species, with the distance values between each point and its nearest neighbor of each species.

Can you confirm that this function is calculating the Euclidean distance between the points in 4 dimensions?

Thanks so much for your help.

Phil
MJA

1 Answer

Yes. With the argument by = "marks", nndist() returns the distance from each data point to the nearest other point of each species, as given by the marks of the pattern (in whatever dimension your points are; in this case, dimension 4). Here is a fully reproducible example with 10 points and two different mark labels:

library(spatstat)
co <- as.data.frame(matrix(runif(10*4), ncol = 4))
dat <- cbind(co, species = rep(c("a","b"), each = 5))
dat
#>            V1         V2         V3          V4 species
#> 1  0.63445421 0.02342485 0.81973509 0.747078710       a
#> 2  0.64659444 0.08671850 0.41461989 0.479899769       a
#> 3  0.98141590 0.91534392 0.11993177 0.225535950       a
#> 4  0.08795223 0.08288239 0.50009446 0.604776775       a
#> 5  0.10289099 0.48330815 0.68406240 0.035143386       a
#> 6  0.11829745 0.19811870 0.65028245 0.482867938       b
#> 7  0.32563624 0.55886364 0.17271701 0.001583984       b
#> 8  0.28140749 0.38131846 0.09478723 0.404689925       b
#> 9  0.14586923 0.86017444 0.20450944 0.786701343       b
#> 10 0.95229731 0.94235418 0.94513459 0.605581855       b
X <- ppx(dat)
X
#> Multidimensional point pattern
#> 10 points 
#> 4-dimensional space coordinates (V1,V2,V3,V4)
#> 1 column of marks: 'species'
nndist(X, by = "marks")
#>            a         b
#> 1  0.4895471 0.6288539
#> 2  0.4895471 0.5728002
#> 3  0.9748167 0.7810672
#> 4  0.5787884 0.2271969
#> 5  0.7203404 0.5321360
#> 6  0.2271969 0.6122530
#> 7  0.5638479 0.4494952
#> 8  0.5728002 0.4494952
#> 9  0.8532317 0.6369029
#> 10 0.9093800 1.1128385
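
As a sanity check on the Euclidean interpretation, the same quantities can be recomputed with base R only. The sketch below is an illustration rather than spatstat's own implementation: it first recomputes the distance between points 1 and 2 from the printed coordinates above, then builds the per-group nearest-neighbour table with dist(). The names p1, p2, d12 and nn_by_group are hypothetical, and the seed is arbitrary.

```r
# Points 1 and 2 copied from the printed data above (all four coordinates).
p1 <- c(0.63445421, 0.02342485, 0.81973509, 0.747078710)
p2 <- c(0.64659444, 0.08671850, 0.41461989, 0.479899769)

# Plain Euclidean distance in 4 dimensions.
d12 <- sqrt(sum((p1 - p2)^2))
d12  # close to the 0.4895471 reported for points 1 and 2 in column "a"

# Same idea for a whole data set, by group, without spatstat.
set.seed(1)  # arbitrary seed, only so the sketch is repeatable
co <- matrix(runif(10 * 4), ncol = 4)
species <- factor(rep(c("a", "b"), each = 5))

# Full 10 x 10 matrix of Euclidean distances over all four coordinates.
D <- as.matrix(dist(co, method = "euclidean"))
diag(D) <- Inf  # a point must not count as its own neighbour

# For each group, take the row-wise minimum over the columns in that group.
# nn_by_group is a hypothetical name, not a spatstat object.
nn_by_group <- sapply(levels(species), function(g) {
  apply(D[, species == g, drop = FALSE], 1, min)
})
nn_by_group
```

If spatstat is loaded, building a ppx from the same co and species and calling nndist(X, by = "marks") should give a matrix with the same layout (one row per point, one column per species), which can be compared against nn_by_group.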
Ege Rubak