The data is Citi Bike NYC trip data from January 2019 to December 2019, which can be viewed here: https://s3.amazonaws.com/tripdata/index.html
You do not need to download the entire dataset; a single month is enough.
The following is an example of some of the columns of the data frame:
start.station.latitude | start.station.longitude | end.station.latitude | end.station.longitude | usertype |
---|---|---|---|---|
40.77897 | -73.97375 | 40.78822 | -73.97042 | Subscriber |
40.75187 | -73.97771 | 40.74780 | -73.97344 | Customer |
The following is the code:

    library(dplyr)    # filter(), group_by(), summarise(), %>%
    library(ggplot2)  # geom_curve(), aes(), labs(), theme(), arrow(), unit()
    library(ggmap)    # get_stamenmap(), ggmap()

    # Count trips per start/end coordinate pair and user type; keep routes with more than 250 trips
    coordinates_table <- ridedata_clean %>%
      filter(start.station.latitude != end.station.latitude &
               start.station.longitude != end.station.longitude) %>%
      group_by(start.station.latitude, start.station.longitude,
               end.station.latitude, end.station.longitude, usertype) %>%
      summarise(total = n(), .groups = "drop") %>%
      filter(total > 250)

    Subscriber <- coordinates_table %>% filter(usertype == "Subscriber")
    Customer <- coordinates_table %>% filter(usertype == "Customer")

    nyc_bb <- c(left = -74.04, bottom = 40.93, right = -73.78, top = 40.78)
    nyc_stamen <- get_stamenmap(bbox = nyc_bb, zoom = 12, maptype = "toner")

    ggmap(nyc_stamen, darken = c(0.8, "white")) +
      geom_curve(Customer,
                 mapping = aes(x = start.station.longitude, y = start.station.latitude,
                               xend = end.station.longitude, yend = end.station.latitude,
                               alpha = total, color = usertype),
                 size = 0.5, curvature = 0.2,
                 arrow = arrow(length = unit(0.2, "cm"), ends = "first", type = "closed")) +
      coord_cartesian() +
      labs(title = "most popular routes by Customers",
           x = NULL, y = NULL,
           caption = "Data by Citi Bikes and Map by ggmap ") +
      theme(legend.position = "none")

Running the above code gives the following message and error:

    Coordinate system already present. Adding new coordinate system, which will replace the existing one.
    Error in grid.Call.graphics(C_raster, x$raster, x$x, x$y, x$width, x$height, : Empty raster
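
One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: in `nyc_bb` above, `bottom` (40.93) is larger than `top` (40.78), so the box has negative height, which might be related to the empty raster. The sketch below only tests whether the bounding box is well-formed, using base R. `bbox_is_valid` is a hypothetical helper name introduced here for illustration; it is not part of ggmap.

```r
# Bounding box exactly as defined in the code above
nyc_bb <- c(left = -74.04, bottom = 40.93, right = -73.78, top = 40.78)

# Hypothetical helper (not a ggmap function): a box is well-formed
# when left < right and bottom < top
bbox_is_valid <- function(bb) {
  unname(bb["left"] < bb["right"] && bb["bottom"] < bb["top"])
}

bbox_is_valid(nyc_bb)  # FALSE, because bottom is greater than top
```

If the check returns `FALSE`, the latitude limits are probably the first thing to revisit before looking at the plotting layers.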