
I need to make a scatter plot over a map. I have a shapefile to draw the map, but I need to convert the shapefile's polygon geometry to lon/lat coordinates so that my points line up with it. This is my code:

borough = gpd.read_file('London_Borough_Excluding_MHW.shp')
borough.crs #{'init': 'epsg:27700'}
crs = {'init': 'epsg:27700'}
gdf = gpd.GeoDataFrame(
    df2, crs=crs, geometry=gpd.points_from_xy(df2.lat, df2.lon))

ax = gdf.plot(marker='*', markersize=0.2)
borough.plot(ax=ax)
plt.show()

Now I get this plot: [image: plot]

My df2 looks like this:

$type    additionalProperties    children    childrenUrls    commonName    id    lat    lon    placeType    url
0   Tfl.Api.Presentation.Entities.Place, Tfl.Api.P...   [{'$type': 'Tfl.Api.Presentation.Entities.Addi...   []  []  River Street , Clerkenwell  BikePoints_1    51.529163   -0.109970   BikePoint   /Place/BikePoints_1
1   Tfl.Api.Presentation.Entities.Place, Tfl.Api.P...   [{'$type': 'Tfl.Api.Presentation.Entities.Addi...   []  []  Phillimore Gardens, Kensington  BikePoints_2    51.499606   -0.197574   BikePoint   /Place/BikePoints_2

My shapefile looks like this:

NAME    GSS_CODE    HECTARES    NONLD_AREA  ONS_INNER   SUB_2009    SUB_2006    geometry
0   Kingston upon Thames    E09000021   3726.117    0.000   F   None    None    POLYGON ((516401.6 160201.8, 516407.3 160210.5...
1   Croydon E09000008   8649.441    0.000   F   None    None    POLYGON ((535009.2 159504.7, 535005.5 159502, ...
2   Bromley E09000006   15013.487   0.000   F   None    None    POLYGON ((540373.6 157530.4, 540361.2 157551.9...
Alexa
  • you could load the shapefile using geopandas and reproject it to the proper CRS? http://geopandas.org/projections.html – Derek Eden Sep 02 '19 at 22:55
  • Thank you @DerekEden I did what you suggested, and now I have a different problem, the scatter points are not over the shapefile map. I updated the question just in case you could help me, Thanks – Alexa Sep 03 '19 at 22:30
  • what happens after you load the shp file and do df.crs? and what is the EPSG code for your desired map? You should just be able to do df['geometry'].to_crs(epsg=XXXX) – Derek Eden Sep 03 '19 at 22:55
  • @DerekEden I updated the code just in case you could help me. After doing df.crs I can plot the graph but I can't show the scatter plots on it. Thanks!! – Alexa Sep 03 '19 at 23:13
  • you need to replace epsg = 27700 with the EPSG of your desired CRS in the .to_crs command to convert your shapefile to the same coordinate system – Derek Eden Sep 04 '19 at 00:52
  • do you know the epsg code or CRS for the df2 points? – Derek Eden Sep 04 '19 at 00:53
  • either convert df to the same EPSG as your scatter points (using the .to_crs command) or convert your scatter points to EPSG 27700 – Derek Eden Sep 04 '19 at 00:59
  • @DerekEden I tried what you suggested and got the plot I added to my question. Could you please take a look and tell me what is wrong in my code? I think my plot is incorrect. Thanks a lot! – Alexa Sep 04 '19 at 06:25
  • you are defining your gdf with the points in a certain CRS but saying they are in EPSG 27700..you need to replace 27700 with the EPSG that the points are in..THEN do .to_crs(epsg=27700) – Derek Eden Sep 04 '19 at 11:44
  • replace crs=crs with crs = {'init' : 'epsg:EPSG_OF_THE_POINTS'} when you define gdf...then do gdf.geometry = gdf.geometry.to_crs(epsg=27700) – Derek Eden Sep 04 '19 at 11:47
  • @Derek Eden my problem is I don't know the EPSG of my points. I tried to get it by doing df2.crs and I got an error, "df2 does not have an attribute crs". How can I get the EPSG of the lat and lon of my dataframe df2? EPSG 27700 is the code of my shapefile. – Alexa Sep 04 '19 at 14:19
  • where did you get the points from? There might be some metadata about them from the source..it's likely that they are in EPSG 4326 (WGS1984)..in any case, you need to have the points/shapefile in the same CRS in order to plot them properly, otherwise there is no way to know how they line up spatially – Derek Eden Sep 04 '19 at 14:25
  • I got df2 from the API of Transport for London. It includes the Lat and lon of bike points. Is there any way to get the info? I am going to try with 4326. – Alexa Sep 04 '19 at 16:00
  • @DerekEden, it did not work... In df2 I have axes x and y in degrees but in borough I have 0, 1000000, 200000, do you think it might be the problem? – Alexa Sep 04 '19 at 16:48
  • yes that is exactly the problem...you have two datasets in two different CRS's and are trying to plot them together.. you need to have them both in the same CRS..you can either convert the points to EPSG 27700 or convert the shp file to the CRS of the points...either way you need to know the CRS of the points (which is likely epsg 4326, there must be metadata on the API website somewhere) assuming it is, you would then do: gdf = gpd.GeoDataFrame( df2, crs={'init':'epsg=4326'}, geometry=gpd.points_from_xy(df2.lat, df2.lon)) – Derek Eden Sep 04 '19 at 17:00
  • then gdf['geometry'] = gdf['geometry'].to_crs(epsg=27700) to convert it to the same CRS as the shp file – Derek Eden Sep 04 '19 at 17:00
  • @DerekEden I tried that and I got this error: ---------------------------- 360 # on case-insensitive filesystems). 361 projstring = projstring.replace('EPSG','epsg') --> 362 return _proj.Proj.__new__(self, projstring) 363 364 def __call__(self, *args, **kw): _proj.pyx in _proj.Proj.__cinit__() RuntimeError: b'no arguments in initialization list' – Alexa Sep 04 '19 at 17:06
  • @DerekEden I converted the borough: borough = gpd.read_file('London_Borough_Excluding_MHW.shp') borough = borough.to_crs({'init': 'epsg:4326'}) fig = plt.figure(figsize=(20, 8)) ax = borough.plot() x, y = ax(df2['lon'].values, df2['lat'].values) ax.scatter(x,y, marker="*", color='r', alpha=0.7, zorder=5, s=9) plt.show... but I can't show the scatter plots: TypeError: 'AxesSubplot' object is not callable – Alexa Sep 04 '19 at 17:57
  • can you send a link to the shapefile and the scatter points? – Derek Eden Sep 04 '19 at 19:53
  • @Derek Eden this is the link of the shapefile zip: https://data.london.gov.uk/dataset/statistical-gis-boundary-files-london and the scatter points are in the api which you can access here: import requests as rq r = rq.get('https://api.tfl.gov.uk/BikePoint?app_key=68180443ed4baffb6640824d8aa7db5c%09&app_id=2f7e332e') r = r.text df2 = pd.read_json(r) df2.head().. Thanks for helping me! – Alexa Sep 04 '19 at 21:52
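
The mismatch discussed in the comments above (degrees on one side, values in the hundreds of thousands on the other) can be spotted before plotting. The following is only a rough sketch, not a real CRS detector: `looks_like_degrees` is a hypothetical helper, and the sample values are copied from the question's df2 rows and borough polygons.

```python
def looks_like_degrees(xs, ys):
    """Heuristic: True if every (x, y) pair fits inside lon/lat degree ranges."""
    return all(-180 <= x <= 180 and -90 <= y <= 90 for x, y in zip(xs, ys))


# Bike points from the question (lon as x, lat as y).
bike_lon = [-0.109970, -0.197574]
bike_lat = [51.529163, 51.499606]

# First vertices of two borough polygons from the question (EPSG:27700, metres).
borough_x = [516401.6, 535009.2]
borough_y = [160201.8, 159504.7]

print(looks_like_degrees(bike_lon, bike_lat))    # True
print(looks_like_degrees(borough_x, borough_y))  # False
```

If the two layers disagree like this, one of them has to be reprojected before they can share an axes. A value range that fits degrees does not prove the data is EPSG:4326, so the source's metadata is still the authority.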

1 Answer

I just downloaded the shapefile and the scatter data. You just need to convert the CRS properly so that both layers use the same one.

FYI the data source states that the scatter data is in WGS 1984 (EPSG:4326)

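import geopandas as gp
import matplotlib.pyplot as plt
import pandas as pd

# Read the boroughs (EPSG:27700) and reproject the whole frame to WGS84
s = gp.read_file('London_Borough_Excluding_MHW.shp')
s = s.to_crs(epsg=4326)

# df is your DataFrame with the scatter data; x is lon and y is lat
# (older geopandas versions take crs={'init': 'epsg:4326'} instead)
df2 = gp.GeoDataFrame(df, crs='EPSG:4326', geometry=gp.points_from_xy(df['lon'], df['lat']))

# Draw the boroughs first, then the points on the same axes
ax = s.plot()
df2.plot(ax=ax, color='red')
plt.show()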

Result (I didn't plot all the points because that needed an API key, so I just copied your snippet of the data):

[image: resulting plot]
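
The opposite direction also works: leave the shapefile in EPSG:27700 and reproject the points instead. Here is a minimal sketch of that, assuming the API coordinates really are EPSG:4326 and a reasonably recent geopandas. The two rows are copied from the question's df2, and the names `points` and `points_bng` are just illustrative.

```python
import geopandas as gpd
import pandas as pd

# Two rows copied from the question's df2; the real frame has more columns.
df = pd.DataFrame({
    "commonName": ["River Street , Clerkenwell", "Phillimore Gardens, Kensington"],
    "lat": [51.529163, 51.499606],
    "lon": [-0.109970, -0.197574],
})

# x is longitude and y is latitude, so lon goes first.
points = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df["lon"], df["lat"]),
    crs="EPSG:4326",  # assumption: the API returns WGS84 degrees
)

# Reproject to the shapefile's CRS (British National Grid, in metres).
points_bng = points.to_crs(epsg=27700)
print(points_bng.geometry.x.round(), points_bng.geometry.y.round())
```

After this, `points_bng.plot(ax=ax)` should land on top of a borough layer that was plotted without any reprojection.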

EDIT - labelling boroughs (you'll have to format to make it prettier)

X = s['geometry'].representative_point().x 
Y = s['geometry'].representative_point().y 
L = s['NAME'] 
for x,y,l in zip(X,Y,L): 
    plt.text(x,y,l)
Derek Eden
  • thank you very much!!! One more question, could you tell me how to add the district names on the map? Maybe using adjustText? – Alexa Sep 05 '19 at 00:10
  • there's probably another way but you can do: X = s['geometry'].representative_point().x Y = s['geometry'].representative_point().y L = s['NAME'] for x,y,l in zip(X,Y,L): plt.text(x,y,l) – Derek Eden Sep 05 '19 at 00:26
  • I got this error: File "", line 10 L = s['NAME'] for x,y,l in zip(X,Y,L): ^ SyntaxError: invalid syntax – Alexa Sep 05 '19 at 00:44
  • thanks a lot @Derek Eden. Do you know how I can paint a selected city on the map in a different color? – Alexa Sep 05 '19 at 17:38
  • make a cmap for all the polygons, or just re-plot a selected city after the fact as a different color http://geopandas.org/mapping.html – Derek Eden Sep 05 '19 at 18:42
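
The "re-plot a selected city afterwards" idea from the last comment could look roughly like this. It is a sketch only: the three squares are a hypothetical stand-in for the borough shapefile so the snippet runs without the file, and the colors are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import geopandas as gpd
from shapely.geometry import box

# Hypothetical stand-in for the shapefile: three square "boroughs".
s = gpd.GeoDataFrame(
    {"NAME": ["Croydon", "Bromley", "Kingston upon Thames"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(0, 1, 1, 2)],
    crs="EPSG:27700",
)

# Plot every borough in a neutral color first.
ax = s.plot(color="lightgrey", edgecolor="black")

# Then draw only the selected borough again, on the same axes, in another color.
selected = s[s["NAME"] == "Croydon"]
selected.plot(ax=ax, color="orange", edgecolor="black")
```

With the real data, `s` would come from `read_file` as in the answer, and the filter on `NAME` stays the same.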