In this geopandas
example, the Antarctica land mass is dropped from the GeoDataFrame
before reprojecting to the Mercator projection, in order to prevent problem with shapes containing a pole (which would become infinitely large).
I was wondering, if it is possible to find a more robust reprojection method, so that the dataframe does not need to be manually adjusted. Especially since I'm working with a dataset that does not have a separate row for Antarctica:
I have two ideas:
1. Use information about the destination crs
On the epsg website for example, the 'Area of use' is shown:
We could use it to prepare the data before the reprojection: drop any shapes that extend further South than -80 deg, or alternatively intersect it with a shapely Polygon
that describes the area of use of the destination crs, in terms of the source crs - in this case the standard epsg:4326
so Polygon([(-180,-80), (-180,84), ...])
.
Issue with this approach: I'm not sure, if this area of use information is programmatically accessible from somewhere for any crs
, e.g. from the GeoDataFrame
object.
2. Fix in post
Just do it, and pick out the erroneously reprojcted parts later. In my current case, for example, the reprojected geodataframe gdf_merc = gdf.to_crs(epsg=3395)
does have errors...
...but by searching for the inf
word in the string representation of the geometry, I can find the offending Polygon
within the MultiPolygon
...
In [360]: for i, polygon in enumerate(gdf_merc.geometry[0]):
...: if 'inf' in str(polygon):
...: print(i)
0
...and just remove it:
Issue with this approach: seems elaborate and I'd prefer to prevent any problems from appearing in the first place.
Any thoughts on how to fix either of these methods, or is there a third way?
One remark: I'm interested in the general case, where any crs could be reprojected to, so I don't want to preemptively remove Antarctica ("just in case"), as other projections might be perfectly fine with it, and, more importantly, they might have other problem areas.
Many thanks!