3

I have a problem with my heatmap, which displays the density LEVEL, but doesn't say anything about the density count. (how many points are in the same area for example).

My data is divided in more columns, but the most important ones are: lat,lon.

I would like to have something like this, but with "count" : https://stackoverflow.com/a/24615674/5316566, however when I try to apply the code he uses in that answer, my maximum-"level" density doesn't reflect my density count.( Intead of 7500 I receive for example 6, even if I have thousands and thousands of data concentrated). That's my code:

us_map_g_str <- get_map(location = c(-90.0,41.5,-81.0,42.7), zoom = 7)
ggmap(us_map_g_str, extent = "device") + 
geom_tile(data = data1, aes(x = as.numeric(lon), y = as.numeric(lat)), size = 0.3) + 
stat_density2d(data = data1, aes(x = as.numeric(lon), y = as.numeric(lat), fill = ..level.., alpha = ..level..), size = 0.3, bins = 10, geom = "polygon") + 
scale_fill_gradient(name= "Ios",low = "green", high = "red", trans= "exp") + 
scale_alpha(range = c(0, 0.3), guide = FALSE)

This is what I get:

enter image description here

This is part of the data:

  lat       lon       tag  device
1 43.33622 -83.67445   0 iPhone5
2 43.33582 -83.69964   0 iPhone5
3 43.33623 -83.68744   0 iPhone5
4 43.33584 -83.72186   0 iPhone5
5 43.33616 -83.67526   0 iPhone5
6 43.25040 -83.78234   0 iPhone5

(The "tag" column is not important)

Community
  • 1
  • 1
U.Cremona
  • 65
  • 10
  • Where would you want to add count? Would you be able to share a part of your data? – jazzurro Sep 11 '15 at 12:12
  • In the legend, instead of level. I want that R calculates the concentration and tells me how many points he can count in an area. – U.Cremona Sep 11 '15 at 13:27
  • I think you forgot one thing when you applied hs code : his data have a "Count" column. so you need to manipulate your data to have one too. – maeVeyable Sep 11 '15 at 14:33
  • It's not important how it's called the column in the dataframe, because I can always change it later – U.Cremona Sep 11 '15 at 14:58

1 Answers1

1

REVISED

I realised that my previous answer needs to be revised. So, here it is. If you want to find out how many data points exist in each level of a contour, you actually have a lot of things to do. If you are happy to use the leaflet option below, your life would be much easier.

First, let's get a map of Detroit, and create a sample data frame.

library(dplyr)
library(ggplot2)
library(ggmap)

mymap <- get_map(location = "Detroit", zoom = 8)

### Create a sample data
set.seed(123)
mydata <- data.frame(long = runif(min = -84, max = -82.5, n = 100),
                     lat = runif(min = 42, max = 42.7, n = 100))

Now, we draw a map and save it as g.

g <- ggmap(mymap) +
     stat_density2d(data = mydata,
                    aes(x = long, y = lat, fill = ..level..),
                    size = 0.5, bins = 10, geom = "polygon")

enter image description here

The real job begins here. In order to find out the numbers of data points in all levels, you want to employ the data frame, which ggplot generates. In this data frame you have data for polygons. These polygons are used to draw level lines. You can see that in the following image, which I draw three levels on a map.

### Create a data frame so that we can find how many data points exist
### in each level.

mydf <- ggplot_build(g)$data[[4]]

### Check where the polygon lines are. This is just for a check.

check <- ggmap(mymap) +
         geom_point(data = mydata, aes(x = long, y = lat)) +
         geom_path(data = subset(mydf, group == "1-008"), aes(x = x, y = y)) +
         geom_path(data = subset(mydf, group == "1-009"), aes(x = x, y = y)) +
         geom_path(data = subset(mydf, group == "1-010"), aes(x = x, y = y)) 

enter image description here

The next step is to reate a level vector for a legend. We group the data by group (e.g., 1-010) and take the first row for each group using slice(). Then, ungroup the data and choose the 2nd column. Finally, create a vector with unlist(). We come back to lev in the end.

mydf %>%
group_by(group) %>%
slice(1) %>%
ungroup %>%
select(2) %>%
unlist -> lev

Now we split the polygon data (i.e., mydf) by group and create a polygon for each level. Since we have 11 levels (11 polygons), we use lapply(). In the lapply loop, we need to do; 1) extract column for longitude anf latitude, 2) create polygon, 3) convert polygons to spatial polygons, 4) assign CRS, 5) create a dummy data frame, and 6) create SpatialPolygonsDataFrames.

mylist <- split(mydf, f = mydf$group)

test <- lapply(mylist, function(x){

              xy <- x[, c(3,4)]

              circle <- Polygon(xy, hole = as.logical(NA))

              SP <- SpatialPolygons(list(Polygons(list(circle), ID = "1")))

              proj4string(SP) <- CRS("+proj=longlat +ellps=WGS84")

              df <- data.frame(value = 1, row.names = "1")

              circleDF <- SpatialPolygonsDataFrame(SP, data = df)

            })

Now we go back to the original data. What we need to to is to convert the data frame to SpatialPointsDataFrame. This is because we need to subset data and find how many data points exist in each polygon (in each level). First, get long and lat from your data.frame. Make sure that the order is in lon/lat.

xy <- mydata[,c(1,2)]

Then, we create SPDF (SpatialPolygonsDataFrame). You want to have an identical proj4string between spatial polygons and spatial points data.

spdf <- SpatialPointsDataFrame(coords = xy, data = mydata,
                               proj4string = CRS("+proj=longlat +ellps=WGS84"))

Then, we subset data (mydata) using each polygon.

ana <- lapply(test, function(y){

              mydf <- as.data.frame(spdf[y, ])

            })

Data points are overlapping across levels; we have duplication. First we try to find out unique data points for each level. We bind data frames in ana and create a data frame, which is foo1. We also create a data frame, which we want to find unique number of data points. We make sure that columns names are all identical between foo1 and foo2. Using setdiff() and nrow(), we can find the unique number of data points in each level.

total <- lapply(11:2, function(x){

                foo1 <- bind_rows(ana[c(11:x)])
                foo2 <- as.data.frame(ana[x-1])
                names(foo2) <- names(foo1)
                nrow(setdiff(foo2, foo1))               
              })

Finally, we need to find the number of data points for the most inner level, which is level 11. We choose a data frame for level 11 in ana and create a data frame and count the number of row.

 bob <- nrow(as.data.frame(ana[11]))
 out <- c(bob,unlist(total))

 ### check if total is 100
 ### sum(out)
 ### [1] 100

We assign reversed out as names for lev. This is because we want to show how many data points exist for each level in a legend.

 names(lev) <- rev(out)

Now we are ready to add a legend.

 final <- g +
          scale_fill_continuous(name = "Total",
                                guide = guide_legend(),
                                breaks = lev)

 final

enter image description here

LEAFLET OPTION

If you use leaflet package, you can group data points with different zooms. Leaflet counts data points in certain areas and indicate numbers in circles like the following figure. The more you zoom in, the more leaflet breaks up data points into small groups. In terms of workload, this is much lighter. In addition, your map is interactive. This may be a better option.

library(leaflet)
leaflet(mydf) %>%
addTiles() %>%
addMarkers(clusterOptions = markerClusterOptions())

enter image description here

jazzurro
  • 23,179
  • 35
  • 66
  • 76
  • Your first edit didn't work so well, but the second method is perfect! Even better than a heatmap. Thank you, very appreaciated – U.Cremona Sep 14 '15 at 08:46
  • @U.Cremona I am glad to hear that the leaflet version worked for you. :) – jazzurro Sep 14 '15 at 09:49
  • do you know how I can custom the value shown in the markers? if I want them to show another column of the data, what do I have to do? – U.Cremona Sep 15 '15 at 14:43
  • @U.Cremona I am afraid that is beyond my knowledge right now. – jazzurro Sep 16 '15 at 14:44
  • no problem man ;) for anyone else who reads the comment I expressed myself badly before: i want to insert a formula which takes the data from a column and the markers have to show this the result – U.Cremona Sep 16 '15 at 15:04
  • @U.Cremona I revised my answer. If necessary, you can try the codes for ggmap and ggplot option. – jazzurro Oct 01 '15 at 13:21