0

Can you help figure out the best way to resolve the length mismatch error thrown by dotsInPolys? I think it is because there are NA's or NULLs or some funk in the polygon data that makes it too long. Here's code that reproduces the error. Ultimately, I want to plot multiple races using Leaflet, but I can't produce the lat/lon needed for the random dots at this point.

require(maptools)
require(tidycensus)

person.number.divider <- 1000

census_api_key("ENTER KEY HERE", install = TRUE)

racevars <- c(White = "B02001_002", #"P005003" 
              Black = "B02001_003", #Black or African American alone
              Latinx = "B03001_003"
)

nj.county <- get_acs(geography = "county", #tract
              year = 2015,
              variables = racevars,
              state = "NJ", #county = "Harris County",
              geometry = TRUE,
              summary_var = "B02001_001")

library(sf)
st_write(nj.county, "nj.county.shp", delete_layer = TRUE)

nj <- rgdal::readOGR(dsn = "nj.county.shp") %>%
  spTransform(CRS("+proj=longlat +datum=WGS84"))

nj@data <- nj@data %>% 
  tidyr::separate(NAME,
                  sep =",",
                  into = c("county", "state"))  %>%
  dplyr::select(estimat,variabl, GEOID, county) %>%
  spread(key = variabl, value = estimat) %>%
  mutate(county = trimws(county))


black.dots <- dplyr::select(nj@data, Black) / person.number.divider #%>%
black.dots <-   dotsInPolys(nj, as.integer(black.dots$Black), f="random")

# Error in dotsInPolys(nj, as.integer(black.dots$Black), f = "random") : 
# different lengths

length(nj) # 63 This seems too many, because I believe NJ has 21 counties.
length(black.dots$Black) # 21

This post (Advice on troubleshooting dotsInPolys error (maptools)) came close to helping me, but I couldn't see how to apply it to my case.

I can change the length of the nj spatialpolygonsdataframe by removing NA's and counties with a black pop greater than 0, but then the map doesn't plot multiple counties (maybe there is something wrong with the census download?).

dca
  • 594
  • 4
  • 18
  • I've mostly resolved the issue. The "spread" that reshaped the data was causing problems. Using `output = "wide"` in the tidycensus download gave a better output. – dca Oct 11 '18 at 20:13

1 Answers1

1

It looks like you might have gotten this figured out, but I wanted to share another approach that uses sf::st_sample() instead of maptools::dotsInPolys(). One advantage of this is that you don't need to convert the sf object you get from tidycensus to a sp object.

In the following example I split the census data by race into a list three sf objects then perform st_sample() on each element of the list (each race). Next, I recombine the sampled points into one sf object with a new race variable for each point. Finally, I use tmap to make a map, though you could use ggplot2 or leaflet to map as well.

library(tidyverse)
library(tidycensus)
library(sf)
library(tmap)

person.number.divider <- 1000

racevars <- c(White = "B02001_002", #"P005003" 
              Black = "B02001_003", #Black or African American alone
              Latinx = "B03001_003"
              )

# get acs data with geography in "tidy" form
nj.county <- get_acs(geography = "county", #tract
                     year = 2015,
                     variables = racevars,
                     state = "NJ", #county = "Harris County",
                     geometry = TRUE,
                     summary_var = "B02001_001"
                     )

# split by race
county.split <- nj.county %>% 
  split(.$variable)

# randomly sample points in polygons based on population
points.list <- map(county.split, ~ st_sample(., .$estimate / person.number.divider))

# combine points into sf collections and add race variable 
points <- imap(points.list, ~ st_sf(tibble(race = rep(.y, length(.x))), geometry = .x)) %>% 
  reduce(rbind)

# map!
tm_shape(nj.county) +
  tm_borders(col = "darkgray", lwd = 0.5) +
  tm_shape(points) +
  tm_dots(col = "race", size = 0.01, pal = "Set2")

I don't have enough rep to post the map image directly, but here it is.

mfherman
  • 392
  • 4
  • 16
  • This is a cool way to do it and I was able to reproduce your map flawlessly. I'm not sure exactly what goes on inside st_sample and st_sf, but I look forward to reading about those two functions. Thanks – dca Nov 21 '18 at 16:03
  • 1
    `st_sample` takes n random samples from the uniform distribution contained within each polygon in the `sf` object and in this case, `st_sf` creates a spatial collection of points (`sfc`) object for each of the three races. – mfherman Nov 26 '18 at 15:40