
I am trying to find the centre coordinates of high-density areas in R.

The dataset I have has about 1.5 million rows and looks like this (dummy data):

      LATITUDE LONGITUDE   val
    1 35.83111 -90.64639 359.1
    2 42.40630 -90.31810  74.5
    3 40.07806 -83.07806 115.4
    4 40.53210 -90.14730 112.0
    5 42.76310 -84.76220 118.4
    6 39.29750 -87.97460 134.4
    ...

I plotted it with ggmap and ggplot2 using the following command:

    ggmap(UK_Map) +
      geom_density2d(data = processedSubsetData,
                     aes(x = processedSubsetData$Longitude, y = processedSubsetData$Latitude), bins = 5) +
      stat_density2d(data = processedSubsetData,
                     aes(x = processedSubsetData$Longitude, y = processedSubsetData$Latitude, fill = ..level.., alpha = ..level..),
                     geom = 'polygon')

This gives the visualization shown in the image below.

[image: geospatial data]

As you can see from the image, there are some high-density areas. I need to find the local centre coordinates of these high-density areas on the map.

I have tried calculating the distances between the points and also rounding the coordinates to group them, but I could not make either approach work and am stuck. Thanks.

hybrid
  • `stat_density2d` uses `MASS::kde2d`, which generates a `z` matrix. Identifying a single mode from that is easy; identifying multiple takes more creativity. – alistaire May 14 '18 at 04:59
  • Can you use clustering such as `kmeans` to create groups to separate them for center point determination? – r2evans May 14 '18 at 05:02
  • @alistaire Thanks for the quick message. I am a complete beginner in R and got this far by referring to blogs and Stack Overflow. The task is to identify multiple centres. Is there any specific algorithm I can follow for this creative approach? – hybrid May 14 '18 at 05:04
  • @r2evans I haven't tried it, but I will definitely give it a try now. – hybrid May 14 '18 at 05:05
  • If you know the number of centers and they're reasonably differentiated, k-means is pleasantly simple, e.g. `broom::tidy(kmeans(faithful, 2))` – alistaire May 14 '18 at 05:32
  • @alistaire I don't know the number of centres I am after. I think k-means is not the ideal thing here. – hybrid May 14 '18 at 05:33
  • Then you're going to have to reach deeper into the [cluster analysis](https://en.wikipedia.org/wiki/Cluster_analysis) toolbox according to your requirements – alistaire May 14 '18 at 05:37
  • Not related to your problem, but don't use the `$` inside `aes` - ggplot can find the variables in the data.frame by itself – Richard Telford May 14 '18 at 10:24
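
The `MASS::kde2d` idea from the comments could be sketched roughly as follows, for the case where the number of centres is not known in advance. This is only a sketch under assumptions, not a tested solution for the real data: the points are simulated stand-ins with two dense blobs, `find_peaks()` is a hypothetical helper (not part of any package), and the grid size `n = 100` and the `min_frac = 0.2` cut-off are arbitrary choices. It estimates the density on a grid and keeps every grid cell that is strictly higher than its eight neighbours and above a fraction of the global maximum.

```r
library(MASS)  # provides kde2d()

# Simulated stand-in for the real data (assumption): two dense blobs
set.seed(42)
pts <- data.frame(
  LONGITUDE = c(rnorm(500, -90, 0.5), rnorm(500, -84, 0.5)),
  LATITUDE  = c(rnorm(500,  36, 0.5), rnorm(500,  42, 0.5))
)

# 2D kernel density estimate on a 100 x 100 grid.
# In dens$z, rows correspond to dens$x and columns to dens$y.
dens <- kde2d(pts$LONGITUDE, pts$LATITUDE, n = 100)

# Hypothetical helper: a grid cell counts as a peak if it is strictly greater
# than all 8 neighbours and at least min_frac times the global maximum.
find_peaks <- function(dens, min_frac = 0.2) {
  z  <- dens$z
  nr <- nrow(z)
  nc <- ncol(z)
  # Pad with -Inf so that border cells can be compared like interior cells
  zp <- matrix(-Inf, nr + 2, nc + 2)
  zp[2:(nr + 1), 2:(nc + 1)] <- z
  is_peak <- z >= min_frac * max(z)
  for (di in -1:1) {
    for (dj in -1:1) {
      if (di == 0 && dj == 0) next
      neighbour <- zp[(2:(nr + 1)) + di, (2:(nc + 1)) + dj]
      is_peak <- is_peak & (z > neighbour)
    }
  }
  idx <- which(is_peak, arr.ind = TRUE)
  data.frame(
    LONGITUDE = dens$x[idx[, "row"]],
    LATITUDE  = dens$y[idx[, "col"]],
    density   = z[idx]
  )
}

peaks <- find_peaks(dens)
print(peaks)
```

How many peaks come out depends heavily on the smoothing: the `h` (bandwidth) argument of `kde2d`, the grid size `n`, and `min_frac` would all need tuning against the real map before the centres can be trusted.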

0 Answers