I am trying to get rid of the spatial geometry that falls outside of the shapefile boundary I read in. Is it possible to do this without manual editing in software like Photoshop, or without manually removing the tracts that extend beyond the city's boundaries? For example, I took out 14 tracts, and this is the result:
I have provided a subset of the data and the key so you can test it yourself. The script is below, and the dataset is at https://github.com/THsTestingGround/SO_geoSpatial_crop_Quest.
I have run st_intersection(gainsville_df$Geomtry$x, gnv_poly$geometry) after converting Geomtry to sf, but I don't know what to do next to get rid of those portions.
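For reference, here is a minimal sketch of what st_intersection does when it is given an sf data frame and a boundary, using made-up squares rather than the Gainesville data. The helper sq() and the objects tracts and boundary are hypothetical, and the toy data has no CRS, so everything is planar. It suggests that the clipped polygons come back as the geometry of the result, with the attribute columns kept.

```r
library(sf)

# Hypothetical helper: an axis-aligned square with its lower-left corner at (xmin, ymin)
sq <- function(xmin, ymin, size) {
  st_polygon(list(rbind(
    c(xmin, ymin),
    c(xmin + size, ymin),
    c(xmin + size, ymin + size),
    c(xmin, ymin + size),
    c(xmin, ymin)
  )))
}

# Two toy "tracts" side by side, each with a Population attribute
tracts <- st_sf(
  Population = c(100, 200),
  geometry = st_sfc(sq(0, 0, 2), sq(2, 0, 2))
)

# Toy "city boundary": covers all of the first tract and half of the second
boundary <- st_sfc(sq(0, 0, 3))

# Clip each tract to the boundary; attributes are carried over
# (sf warns that attributes are assumed to be spatially constant)
clipped <- st_intersection(tracts, boundary)

# The second tract's area shrinks because part of it lay outside the boundary
st_area(tracts)
st_area(clipped)
```

The result is itself an sf object, so it can be passed to geom_sf() directly instead of the unclipped tracts.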
library(sf)
library(tigris)
library(tidyverse)
library(tidycensus)
library(readr)
library(data.table)
# read the city boundary shapefile
gnv_poly <- sf::st_read("PATH\\GIS_cgbound\\cgbound.shp") %>%
  sf::st_transform(crs = 4326) %>%
  sf::st_polygonize() %>%
  sf::st_union()
# I took out the "geometry" column of latitude and longitude because it was corrupting my csv, but we can rebuild it like so
gnv_latlon <- readr::read_csv("new_dataframe_data.csv") %>%
  dplyr::select(ID,
                Latitude,
                Longitude,
                Location) %>%
  dplyr::mutate(Location = gsub(x = Location, pattern = "POINT \\(|\\)", replacement = "")) %>%
  tidyr::separate(col = "Location", into = c("lon", "lat"), sep = " ") %>%
  sf::st_as_sf(coords = c("lon", "lat")) %>% # same columns as positions 4 and 5
  sf::st_set_crs(4326)
# then you can match the ID from gnv_latlon to the rest of the data
gainsville_df <- fread("new_dataframe_data.csv", drop = c("Latitude", "Longitude", "Census Code"))
gainsville_df <- merge(gnv_latlon, gainsville_df, by = "ID")
# remove latitude and longitude points that fall outside of the polygon
# (the check uses the merged object so the flags line up with its rows even if merge() reorders them)
outliers_before <- dplyr::mutate(gainsville_df, check = as.vector(sf::st_intersects(x = gainsville_df, y = gnv_poly, sparse = FALSE)))
# st_filter() names its predicate argument .predicate
gainsville_df <- sf::st_filter(x = outliers_before, y = gnv_poly, .predicate = sf::st_intersects)
# I took out my Census API key because of feedback from an SO member. Please add a comment
# if you would like my census key.
# I use this function from tidycensus to retrieve the tract shapefiles for the county.
alachua <- tidycensus::get_acs(state = "FL", county = "Alachua", geography = "tract", geometry = TRUE, variables = "B01003_001")
gainsville_df$Geomtry <- NULL
gainsville_df$Geomtry <- alachua$geometry[match(as.character(gainsville_df$`Geo ID`), alachua$GEOID)]
# gets us the first graph with the boundary
ggplot() +
  geom_sf(data = gainsville_df, aes(geometry = Geomtry, fill = Population), alpha = 0.2) +
  coord_sf(crs = 4326) + # the "+init=epsg:4326" string form is deprecated
  geom_sf(data = gnv_poly) # with alpha added, we get the transparent boundary
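To isolate the point-filtering step from the script above, here is a small sketch on made-up data (a unit square and three points, none of it from the Gainesville dataset). The names poly, pts, inside_flag and kept are hypothetical. It shows the two equivalent ways the script flags and drops points outside a polygon, assuming planar coordinates with no CRS.

```r
library(sf)

# Toy polygon: the unit square
poly <- st_sfc(st_polygon(list(rbind(
  c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0)
))))

# Three toy points: the first and third are inside the square, the second is outside
pts <- st_sf(
  ID = 1:3,
  geometry = st_sfc(
    st_point(c(0.5, 0.5)),
    st_point(c(2, 2)),
    st_point(c(0.2, 0.8))
  )
)

# Dense logical matrix (one column per polygon), flattened to one flag per point
inside_flag <- as.vector(st_intersects(pts, poly, sparse = FALSE))

# Keep only the points that intersect the polygon
kept <- st_filter(pts, poly, .predicate = st_intersects)

inside_flag
kept$ID
```

This only removes points. The tract polygons attached afterwards via match() are untouched by it, which is why they can still extend past the boundary in the plot.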
Now I would like to get from the first image to the second one without doing any further manual manipulation.
I found "Compare spatial polygons and keep or delete common boundaries in R", but the person there wanted to remove just the boundaries from one shapefile. I tried to adapt it, to no avail.
EDIT: Here is what I've tried following SymbolixAU's direction, but my idx variable is just the numbers 1:7.
fl <- sf::st_read("PATH\\GIS_cgbound\\cgbound.shp") %>% sf::st_transform(crs = 4326)
gainsville_df$Geomtry <- sf::st_as_sf(gainsville_df$Geomtry) %>% sf::st_transform(crs= 4326)
# normal boundary plot
plot( fl[, "geometry"] )
# And we can make a boundary by selecting some of the geometries and union-ing them
boundary <- fl[ gnv_poly$geometry %in% gainsville_df$Geomtry, ]
boundary <- sf::st_union( fl ) %>% sf::st_as_sf()
## So now 'boundary' represents the area you want to cut out of your total shapes
## So you can find the intersection by an appropriate method
## st_contains will tell you all the shapes from 'fl' contained within the boundary
idx <- sf::st_contains(x = boundary, y = fl)
#doesn't work, thus no way of knowing the overlaps
#plot( fl[ idx[[1]], "geometry" ] )
# several more plots which I can't make sense of
plot( fl[ st_intersection(gainsville_df$Geomtry, gnv_poly$geometry), ])
plot(gainsville_df$Geomtry) #this just plots tracts
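A toy sketch that may explain the idx result, again with made-up squares (sq(), parts, outline and window are hypothetical names). st_contains returns a sparse list with one element per feature of x, holding the indices of the y features it contains. When the boundary is the union of the very features being tested, every feature is contained, so the single list element is just the full index sequence.

```r
library(sf)

# Hypothetical helper: an axis-aligned square with its lower-left corner at (xmin, ymin)
sq <- function(xmin, ymin, size) {
  st_polygon(list(rbind(
    c(xmin, ymin),
    c(xmin + size, ymin),
    c(xmin + size, ymin + size),
    c(xmin, ymin + size),
    c(xmin, ymin)
  )))
}

# Three toy features standing in for the rows of 'fl'
parts <- st_sf(
  id = 1:3,
  geometry = st_sfc(sq(0, 0, 1), sq(1, 0, 1), sq(5, 5, 1))
)

# Boundary built as the union of those same features
outline <- st_union(parts)

# One list element (outline has one feature) listing every part it contains
idx <- st_contains(outline, parts)
length(idx)
idx[[1]]

# For contrast: testing against a separate, smaller window singles out some features
window <- st_sfc(sq(0, 0, 2))
inside <- st_within(parts, window, sparse = FALSE)[, 1]
inside
```

Under these assumptions, an index that is just 1:n says nothing about which shapes cross the boundary. The test only becomes informative when the boundary is a different layer from the features being checked.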