I'm running a QAQC protocol on a dataset of points with x and y coordinates, and I want to determine whether each point falls inside a polygon (points outside the polygon should be flagged as false values).

Judging from the plot, one point is outside the polygon and all the others are inside. However, point.in.polygon reports that no point is inside the polygon.

I figure the problem must be either with the points in the ggplot object or with ggplot_build, as I have tested the point.in.polygon function extensively with my polygon and random point values.

Here is a reproducible example; hopefully somebody can point out where exactly I'm making a mistake.

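The packages and the values for the polygon:

library(ggplot2)  # ggplot, geom_point, ggplot_build
library(ggforce)  # geom_ellipse
library(sp)       # point.in.polygon

mean_aug_temp_ref=13.8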
mean_aug_sd_ref=3.906242
sd_aug_temp_ref=3.083504
sd_aug_sd_ref=1.699699


#and these are my point values

data=data.frame("x"=c(16.17419355,16.79354839,16.37096774,15.7483871, 17.07741935,16.18387097,
  16.38064516,15.91612903,15,14.42580645,14.91935484,15.5,15.78709677,15.88709677,
  23.9,18.22258065,15.51612903,14.8516129,14.93548387,15.93225806,17.6483871,16.57741935,
  16.27419355,15.79354839,15.70322581,15.23548387,15.8516129,16.95483871,16.58064516,
  16.25806452,18.13225806,16.46774194,16.10645161,14.80322581,16.85806452,13.24516129,
  14.28387097,14.56451613),"y"=c(3.422182138,3.325302421,5.216263575,4.932097849,3.247799051,3.658370522,
  3.498499886,3.901150792,4.236607552,3.960090498,3.781208758,3.783591385,
  3.693390973,3.806386412,0.48730997,2.301078,3.721169197,4.045304928,3.684483053,
  3.41859195,2.901957554,3.415018251,3.466360853,3.79302042,3.739892688,4.178312743,
  4.041067269,2.901698087,2.832576457,3.230205585,3.063566527,3.068009,3.13238139,
  4.655432875,3.282535421,4.515352932,3.374136237,4.564639348))

test_object=ggplot(data=data, aes(x, y))+
  geom_point()+ #the point layer
  #ellipse for 5 times the sd for mean and sd of reference values
  geom_ellipse(aes(a=sd_aug_sd_ref*5, x0=mean_aug_temp_ref, b=sd_aug_temp_ref*5, y0=mean_aug_sd_ref, angle=0))

built <- ggplot_build(test_object)$data
points <- built[[1]] #first list element are the points
ell <- built[[2]] #second list element is the ellipse

dat <- data.frame(
  data[,1:2], #first two columns are the coordinates
  in.ell = as.logical(point.in.polygon(point.x=points$x, point.y=points$y, pol.x=ell$x, pol.y=ell$y)))

Dana
  • You seem to have missed out `mean_aug_temp_ref` so your code isn't reproducible. I'm guessing it's the same as mean(data$x), but if so, all the points are inside the ellipse on my plot. – Allan Cameron May 24 '20 at 12:39
  • Sorry, I added mean_aug_temp_ref. It's different from mean(data$x) as it is the mean value from a reference dataset. – Dana May 24 '20 at 13:01

1 Answer

The problem occurs because of this line:

ell <- built[[2]]

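If you examine this data frame, you'll see it actually contains 38 copies of the ellipse (one for each point). This happens because the geom_ellipse layer inherits the 38-row point data, so it draws one ellipse per row. The polygon you pass to point.in.polygon is therefore not a single ellipse but a path that traces the same ellipse 38 times. The solution is to filter on the group so that you keep just a single copy of the ellipse: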

ell <- built[[2]][built[[2]]$group == built[[2]]$group[1],] 

dat <- data.frame(
        data[,1:2], #first two columns are the coordinates
        in.ell = as.logical(point.in.polygon(points$x, points$y, pol.x=ell$x, pol.y=ell$y)))

dat$in.ell
#>  [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
#> [13]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
#> [25]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
#> [37]  TRUE  TRUE

Here you can see that all points except the 15th are inside the ellipse.
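
As a cross-check that does not depend on ggplot_build at all, the ellipse equation can be evaluated directly. Because the ellipse is drawn with angle=0, a point (x, y) lies inside or on it when ((x - x0)/a)^2 + ((y - y0)/b)^2 <= 1. The following is only a minimal sketch: in_ellipse is a hypothetical helper name (not part of sp or ggforce), and pts holds just three of the 38 points from the question, including the 15th.

```r
# Ellipse parameters from the question: semi-axes are 5 times the reference SDs
x0 <- 13.8            # mean_aug_temp_ref (centre, x)
y0 <- 3.906242        # mean_aug_sd_ref   (centre, y)
a  <- 1.699699 * 5    # sd_aug_sd_ref * 5   (semi-axis along x)
b  <- 3.083504 * 5    # sd_aug_temp_ref * 5 (semi-axis along y)

# Hypothetical helper: TRUE for points inside or on an axis-aligned ellipse
in_ellipse <- function(x, y, x0, y0, a, b) {
  ((x - x0) / a)^2 + ((y - y0) / b)^2 <= 1
}

# Three points from the question: the 1st, the 15th and the 36th
pts <- data.frame(
  x = c(16.17419355, 23.9, 13.24516129),
  y = c(3.422182138, 0.48730997, 4.515352932)
)

inside <- in_ellipse(pts$x, pts$y, x0, y0, a, b)
inside
#> [1]  TRUE FALSE  TRUE
```

Applied to the full data frame, the same call would be in_ellipse(data$x, data$y, x0, y0, a, b). Note that this tests the exact ellipse, whereas point.in.polygon tests the polygon approximation that geom_ellipse draws, so points very close to the boundary could in principle be classified differently.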

Allan Cameron
  • You are a genius, thank you so much! You definitely spared me another sleepless night! – Dana May 24 '20 at 13:48