I have been provided with some customer data in Latitude, Longitude, and Counts format. All the data I need to create a ggplot heatmap is present, but I do not know how to put it into the format ggplot requires.
I am trying to aggregate the data by total counts within 0.01 Lat and 0.01 Lon blocks (typical heatmap), and I instinctively thought "tapply". This creates a nice summary by block size, as desired, but the format is wrong. Furthermore, I would really like to have empty Lat or Lon block values be included as zeroes, even if there is nothing there... otherwise the heatmap ends up looking streaky and odd.
I have created a subset of my data for your reference in the code below:
# m is the matrix of data provided
m = matrix(c(44.9591051,44.984884,44.984884,44.9811399,
44.9969096,44.990894,44.9797023,44.983334,
-93.3120017,-93.297668,-93.297668,-93.2993524,
-93.2924484,-93.282462,-93.2738911,-93.26667,
69,147,137,22,68,198,35,138), nrow=8, ncol=3)
colnames(m) <- c("Lat", "Lon", "Count")
m <- as.data.frame(m)
s = as.data.frame((tapply(m$Count, list(round(m$Lon,2), round(m$Lat,2)), sum)))
s[is.na(s)] <- 0
# Data frame "s" has all the data, but not exactly in the format desired...
# First, it has a column for each latitude, instead of one column for Lon
# and one for Lat, and second, it needs to have 0 as the entry data for
# Lat / Lon pairs that have no other data. As it is, there are only zeroes
# when one of the other entries has a Lat or Lon that matches... if there
# are no entries for a particular Lat or Lon value, then nothing at all is
# reported.
desired.format = matrix(c(44.96,44.96,44.96,44.96,44.96,
44.97,44.97,44.97,44.97,44.97,44.98,44.98,44.98,
44.98,44.98,44.99,44.99,44.99,44.99,44.99,45,45,
45,45,45,-93.31,-93.3,-93.29,-93.28,-93.27,-93.31,
-93.3,-93.29,-93.28,-93.27,-93.31,-93.3,-93.29,
-93.28,-93.27,-93.31,-93.3,-93.29,-93.28,-93.27,
-93.31,-93.3,-93.29,-93.28,-93.27,69,0,0,0,0,0,0,
0,0,0,0,306,0,0,173,0,0,0,198,0,0,0,68,0,0),
nrow=25, ncol=3)
colnames(desired.format) <- c("Lat", "Lon", "Count")
desired.format <- as.data.frame(desired.format)
minneapolis = get_map(location = "minneapolis, mn", zoom = 12)
ggmap(minneapolis) + geom_tile(data = desired.format, aes(x = Lon, y = Lat, alpha = Count), fill="red")