I'm working on a project with a very large number of points, and I want to identify regions (defined by a lack of clustering) where the point density is statistically significantly lower than elsewhere. Normally a visual would be enough, but I have so many points that it is too difficult to tell where these empty spaces are, and a density heat map doesn't help me zero in on smaller regions. Maybe I'm missing something very simple here, but I'm hoping someone can at least point me in the right direction. Below is a quick-and-dirty reproducible sample: let's take some points from the city's open data and map them onto the borough boundary file for NYC:
#libraries--------------------------
library(ggplot2)
library(ggmap)
library(sp)
library(jsonlite)
library(RJSONIO)
library(rgdal)
#call api data--------------------------
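# jsonlite and RJSONIO both export fromJSON(); the reshaping below expects the
# list-of-records output from RJSONIO, so call it explicitly
df = RJSONIO::fromJSON("https://data.cityofnewyork.us/resource/24t3-xqyv.json?$query= SELECT Lat, Long_")
# flatten the records into a two-column data frame (X1 = lat, X2 = long)
df <- data.frame(t(matrix(unlist(df),nrow=length(unlist(df[1])))))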
names(df)[names(df) == 'X2'] = 'x'
names(df)[names(df) == 'X1'] = 'y'
df = df[, c("x", "y")]
df$x = as.numeric(as.character(df$x))
df$y = as.numeric(as.character(df$y))
df$x = round(df$x, 4)
df$y = round(df$y, 4)
df$x[df$x < -74.2] = NA
df$y[df$y < 40.5] = NA
df = na.omit(df)
#map data----------------------------
cd = readOGR("nybb.shp", layer = "nybb")
cd = spTransform(cd, CRS("+proj=longlat +datum=WGS84"))
cd_f = fortify(cd)
#plot data---------------------------
nyc = ggplot() +
  geom_polygon(aes(x = long, y = lat, group = group),
               data = cd_f, fill = 'grey', color = 'black',
               size = .2, alpha = 1)
nyc + geom_point(aes(x = x, y = y), data = df, size = 1)
#How would I go about finding the empty spaces, i.e. the regions where there are no clusters?
In this case there aren't a lot of points, but for the sake of demonstration, how would I:
- identify pockets of low density?
- potentially draw polygon boundaries around those pockets?
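To make the question concrete, here is a minimal sketch of one naive approach: bin the points into a regular grid and flag cells whose count is significantly below what a uniform spread would give (a lower-tail Poisson test with a Benjamini-Hochberg correction). The helper `find_sparse_cells`, the grid size, and the simulated data are all hypothetical and only for illustration. It also assumes the whole bounding box is valid area, so on the real data any cells over water or outside the boroughs would be flagged too and would need clipping to the borough polygons.

```r
# Hypothetical helper (not from any package): flag grid cells whose point
# count is significantly lower than expected under a uniform spread.
find_sparse_cells <- function(x, y, n_bins = 10, alpha = 0.05) {
  # Equal-width breaks spanning the bounding box of the points
  x_breaks <- seq(min(x), max(x), length.out = n_bins + 1)
  y_breaks <- seq(min(y), max(y), length.out = n_bins + 1)
  x_bin <- cut(x, x_breaks, include.lowest = TRUE, labels = FALSE)
  y_bin <- cut(y, y_breaks, include.lowest = TRUE, labels = FALSE)

  # Count points per cell; explicit factor levels keep empty cells as zeros
  counts <- table(factor(x_bin, levels = seq_len(n_bins)),
                  factor(y_bin, levels = seq_len(n_bins)))
  cells <- as.data.frame(counts, responseName = "n")
  cells$ix <- as.integer(as.character(cells$Var1))
  cells$iy <- as.integer(as.character(cells$Var2))

  # Assumption: under the null, every cell has the same expected count
  lambda <- length(x) / n_bins^2
  cells$p <- ppois(cells$n, lambda)               # P(count <= observed)
  cells$p_adj <- p.adjust(cells$p, method = "BH")  # multiple-testing correction
  cells$sparse <- cells$p_adj < alpha

  # Cell corners, so flagged cells can be drawn as rectangles
  cells$xmin <- x_breaks[cells$ix]
  cells$xmax <- x_breaks[cells$ix + 1]
  cells$ymin <- y_breaks[cells$iy]
  cells$ymax <- y_breaks[cells$iy + 1]
  cells[, c("ix", "iy", "xmin", "xmax", "ymin", "ymax",
            "n", "p", "p_adj", "sparse")]
}

# Simulated demo: uniform points on the unit square with a hole cut out
set.seed(1)
sim <- data.frame(x = runif(5000), y = runif(5000))
in_hole <- sim$x > 0.4 & sim$x < 0.6 & sim$y > 0.4 & sim$y < 0.6
sim <- sim[!in_hole, ]

cells <- find_sparse_cells(sim$x, sim$y, n_bins = 10)
sparse <- cells[cells$sparse, ]
print(sparse[, c("xmin", "xmax", "ymin", "ymax", "n", "p_adj")])
```

On the data above this would presumably be called as `find_sparse_cells(df$x, df$y)`, and the flagged cells could be overlaid with something like `nyc + geom_rect(aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax), data = sparse, fill = NA, color = 'red')`. The result depends heavily on the grid size, and rectangles are only a crude stand-in for real polygon boundaries, which is why I suspect there is a better tool for this.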
Appreciate the help!