2

I have a dataset with the following coordinates:

lat <- c(-5.9, 35.9, 5.13, -3.4)
lon <- c(-19.9, -6, -39.9, 9.38)

So the latitude ranges from -5.9 – 35.9 whereas the longitude ranges from -39.9 – 9.38. The four lat/lot coordinates represent the area's boundaries (four corners).

What I would like to do is create a grid which splits/separates this range into 20 (or so) equal cells. I've tried the following method which splits the instances into equal parts and creates a new column assigning each split a number (1–20).

df$LAT_SPLIT <- NA
df$LAT_SPLIT <- as.numeric(cut2(df$LAT, g=20))

However, each split does not have the same coordinate range (degrees), which creates grids of different sizes. My question is, how would one separate the above coordinates to create a grid with equal cells, while creating a new column where each cell is assigned a number?

I've read different approaches such that each cell represents 1 degree change in latitude * 30 minutes of longitude but I'm not sure how to do that. I've tried to change the code above so that each degree change in latitude splits the latitude column but I couldn't quite figure out how to that that either, I believe you can use sequence? Even if I could get that to work, the longitude will still have different ranges..

I've tried this in R but any suggestions using Python are appreciated as well Really looking forward to any possible solutions, thanks!

Reproducible code

df <- structure(list(LAT = c(35.61226, 35.34986, 35.17794, 34.60425,34.40356, 33.94433, 33.41834, 16.89566, 16.89561, 16.89561),
                     LON = c(-9.604802, -9.803048, -9.921968, -10.30782, -10.44971,-10.76656, -11.13053, -24.99776, -24.99788, -24.99773)), 
                     .Names = c("LAT","LON"), class = "data.frame", row.names = c(1L, 2L, 3L, 4L, 5L,6L, 7L, 44161L, 44162L, 44163L))
loki
  • 9,816
  • 7
  • 56
  • 82
avgjoe13
  • 115
  • 2
  • 10
  • Are you looking for cut2from the Hmisc package or cut_interval from ggplot2? – MLEN Jul 31 '17 at 09:21
  • `seq(minvalue, maxvalue, length.out = 21)` will give you 20 equally spaced intervals. You could then feed this into `cut(values, breaks=seq(...))` to allocate your data into these intervals. – Andrew Gustar Jul 31 '17 at 09:25
  • Do you want _exactly_ 20 cells or do they need to have a specific resolution and/or be square sized? – Val Jul 31 '17 at 10:47
  • @Val Thanks for replying. I don't need 20 cells exactly but I would like them to all be the same size. I just need the range for the above coordinates (grid) to be divided into cells so that i can see how many coordinates (instances of lat/lon) belong to each cell. My df consists of about 45,000 instances with lat/lon coordinates. Some of those coordinates occur more often in certain areas which is why I need the cells so I can assign them to different cells. Just not sure what the best approach would be. If anyone knows how to code that it would be greatly appreciated! – avgjoe13 Jul 31 '17 at 21:17
  • @MLEN any approach that works is welcome, I'm not too picky, please see the comment above it may make what I'm trying to do a bit clearer, thanks – avgjoe13 Jul 31 '17 at 21:21
  • @AndrewGustar could you show me how you would do that for a 1 degree change in latitude so that each change in longitude is split equally as well for all the cells (thus creating equal sized grids)? Would be great, thanks. – avgjoe13 Jul 31 '17 at 21:27

1 Answers1

1

Here is an approach to generate a regular polygon grid.

First we start by converting the data.frame into a SpatialPointsDataFrame

library(sp)
dfSp <- SpatialPointsDataFrame(matrix(c(df$LON, df$LAT), nrow = nrow(df)), data = df)

Afterwards, we use makegrid to create a regular grid of central points and convert this into a SpatialPointsDataFrame as well.

grid <- makegrid(dfSp, n = 20)
gridSp <- SpatialPointsDataFrame(grid, data = data.frame(id = rownames(grid)))

In order to generate polygons, we utilize some raster:: functions. First, we create a RasterLayer which is then converted to polygons.

library(raster)
gridSpRas <- rasterFromXYZ(gridSp)
gridPoly <- rasterToPolygons(gridSpRas, dissolve = T)

This data also has a numeric identifier for each polygon (in this case gridPoly$layer):

str(gridPoly@data)
# 'data.frame': 30 obs. of  1 variable:
#   $ layer: num  19 20 21 22 24 14 15 16 17 18 ...

Let's have a look at the results:

plot(gridPoly)
points(dfSp, col = "red", pch = "+")

enter image description here

Usecase:

As an example, you can count the number of points within each polygon like this:

gridPoly$count <- unlist(lapply(1:length(gridPoly), 
                                function (x) {length(dfSp[gridPoly[x, ], ])}))
spplot(gridPoly, zcol = "count")

enter image description here

loki
  • 9,816
  • 7
  • 56
  • 82
  • @Ioki Thank you, works like a charm. Just have one additional request. How could I add the gridPoly$count column to my original df, so that each instance that belongs to a certain cell would be assigned a number, in this case from 1–30. This way I can group my data by the column "grid" for further analysis. – avgjoe13 Aug 01 '17 at 09:45
  • you can do this with a spatial join like in the *usecase*: `dfSp$count <- unlist(lapply(1:length(dfSp), function(x) {gridPoly[dfSp[x, ],]$count}))`. Since your question was about generating a grid, you might mark the answer as a solution to your Q to let future users know that this approach is valid. – loki Aug 01 '17 at 11:24
  • @Ioki marked it as answered, thank you! However, the code above to create a new column gave me the following error: replacement has 44157 rows, data has 44155. Any idea why? Ok, I changed the number of cells to 30 and it gave me 42 different cells (gives me 44155 rows), which is fine. But when I try to assign the cell id to each row the coordinates belong to it assigns the number of instances per row instead, which is really strange, and it takes forever, I tried changing $count to $id but that didn't work either.. – avgjoe13 Aug 02 '17 at 09:09
  • "the number of instances per row instead" seems to me as if you have some `length()` function there, as in the example in the answer. For spatial joining purposes you should have a look at other threads like [this](https://gis.stackexchange.com/q/137621/26687) or [this](https://gis.stackexchange.com/q/171856/26687). – loki Aug 02 '17 at 09:31
  • Thanks those threads helped me. For those trying to solve a similar problem, I added an id column to the polygons like so: `gridPoly@data$poly.ids <- 1:nrow(gridPoly)`, then combined the two using the following function from the spatialEco package: `pts.poly <- point.in.poly(dfSp, gridPoly)`. @Ioki do you happen to know how to add the cell id numbers to the grid? – avgjoe13 Aug 03 '17 at 18:23
  • I would think abut something like `gridPoly$id <- 1:nrow(gridPoly)`...? Imho, I don't really understand the Question. – loki Aug 03 '17 at 18:27
  • When you plot I would like to see the cell's "id" so that each cell has a number in the middle, is that possible? They do it in this thread with over, I've tried `over(gridPoly, gridSp)` which seems to work but I don't know how to use it.. thanks for your time @loki – avgjoe13 Aug 04 '17 at 08:27
  • For that, no spatial operation is necessary. you can use the centroids of the polygons and then plot the text. A quick google search brought [this](https://chemicalstatistician.wordpress.com/2013/03/02/adding-labels-to-points-in-a-scatter-plot-in-r/) up. Perhaps it helps. – loki Aug 04 '17 at 08:31
  • I'll take a look, thanks again @loki. If it doesn't work I might just start a new thread. Have to figure out how to overlay the grid to a map as well but I'm sure I'll manage. Not going to bother you any longer haha – avgjoe13 Aug 04 '17 at 08:51
  • Got it, you were right, all I had to do was this: `text(gridPoly, labels = "poly.ids")` – avgjoe13 Aug 04 '17 at 09:21
  • Have to ask one last question.. I can't figure out how to overlay the grid to a map, I've tried it with maptools and RgoogleMaps. I've looked at every thread out there, such as [this](https://gis.stackexchange.com/questions/88830/overlay-a-spatial-polygon-with-a-grid-and-check-in-which-grid-element-specific-c) and [this](https://gis.stackexchange.com/questions/210092/plotting-square-grids-on-a-map-and-extracting-each-grid-information-using-r) one but I can't figure it out. How would I overlay the grid on this map `map(database= "world", ylim=c(-5.9, 35.9), xlim=c(-39.9, 9.38))` ? – avgjoe13 Aug 08 '17 at 11:11
  • Have a look at [this](https://stackoverflow.com/a/34578958/3250126). Just ignore the raster part. This is how i plot data on base maps. – loki Aug 08 '17 at 11:34
  • Ok I've tried it and I do get the layers just like you did in your example but what I want are the cells (with id) and the plotted points and behind it a map, can be a very simple map with just the country borders. I'm aware that this may be too much to ask, if you think it's better to just start a new thread let me know, thanks again @loki – avgjoe13 Aug 08 '17 at 13:05