0

My goal here is to make a geodataframe from a couple of columns of coordinates in an existing dataframe, take those 1677 geographic points and add a buffer circle around each, then union the resulting polygons into a multipolygon. Where I keep getting wrapped around the axle is the .buffer() part of geopandas doesn't seem to be using the units of measure for the CRS I've selected.

In  []: ven_coords

Out []:     VenLat      VenLon
       0    42.34768    -71.085359
       1    42.349014   -71.081096
       2    42.347627   -71.081685
       3    42.348718   -71.077984
       4    42.34896    -71.081467
     ...         ...           ...
    1672    42.308962   -71.073516
    1673    42.313169   -71.089027
    1674    42.309717   -71.08247
    1675    42.356336   -71.074386
    1676    42.313005   -71.089887
    1677 rows × 2 columns

In  []: ven_coords_gdf = geopandas.GeoDataFrame(ven_coords, 
                                        geometry=geopandas.points_from_xy(ven_coords.VenLon, ven_coords.VenLat))
        ven_coords_gdf

Out []: VenLat  VenLon  geometry
       0    42.34768    -71.085359  POINT (-71.08536 42.34768)
       1    42.349014   -71.081096  POINT (-71.08110 42.34901)
       2    42.347627   -71.081685  POINT (-71.08168 42.34763)
       3    42.348718   -71.077984  POINT (-71.07798 42.34872)
       4    42.34896    -71.081467  POINT (-71.08147 42.34896)
     ...         ...           ...                        ...
    1672    42.308962   -71.073516  POINT (-71.07352 42.30896)
    1673    42.313169   -71.089027  POINT (-71.08903 42.31317)
    1674    42.309717   -71.08247   POINT (-71.08247 42.30972)
    1675    42.356336   -71.074386  POINT (-71.07439 42.35634)
    1676    42.313005   -71.089887  POINT (-71.08989 42.31300)
    1677 rows × 3 columns

So far so good, let's see what sort of thing I got back:

In  []: print('Type:', type(ven_coords_gdf), "/ current CRS is:",ven_coords_gdf.crs)

Out []: Type: <class 'geopandas.geodataframe.GeoDataFrame'> / current CRS is: None

It has no CRS, so I assign it the one relevant to what I'm working on:

In  []: ven_coords_gdf.crs = ("epsg:2249")
        print('Type:', type(ven_coords_gdf), "/ current CRS is:",ven_coords_gdf.crs)

Out []: Type: <class 'geopandas.geodataframe.GeoDataFrame'> / current CRS is: epsg:2249

It appears to have "taken" the CRS I added, and just to double-check, let's take a look at the details for the CRS in question:

In  []: CRS.from_epsg(2249)

Out []: <Projected CRS: EPSG:2249>
        Name: NAD83 / Massachusetts Mainland (ftUS)
        Axis Info [cartesian]:
        - X[east]: Easting (US survey foot)
        - Y[north]: Northing (US survey foot)
        Area of Use:
        - name: United States (USA) - Massachusetts onshore - counties of Barnstable; Berkshire; Bristol; Essex; Franklin; Hampden; Hampshire; Middlesex; Norfolk; Plymouth; Suffolk; Worcester.
        - bounds: (-73.5, 41.46, -69.86, 42.89)
        Coordinate Operation:
        - name: SPCS83 Massachusetts Mainland zone (US Survey feet)
        - method: Lambert Conic Conformal (2SP)
        Datum: North American Datum 1983
        - Ellipsoid: GRS 1980
        - Prime Meridian: Greenwich

2249 uses the U.S. Survey Foot as it's unit of measure, so I'll set my buffer at 1000 to get a 1000 foot radius from each of the points in my data:

In  []: ven_coords_buffer = ven_coords_gdf.geometry.buffer(distance = 1000)
        ven_coords_buffer

Out []: 0       POLYGON ((928.915 42.348, 924.099 -55.669, 909...
        1       POLYGON ((928.919 42.349, 924.104 -55.668, 909...
        2       POLYGON ((928.918 42.348, 924.103 -55.670, 909...
        3       POLYGON ((928.922 42.349, 924.107 -55.668, 909...
        4       POLYGON ((928.919 42.349, 924.103 -55.668, 909...
                                     ...                        
        1672    POLYGON ((928.926 42.309, 924.111 -55.708, 909...
        1673    POLYGON ((928.911 42.313, 924.096 -55.704, 909...
        1674    POLYGON ((928.918 42.310, 924.102 -55.707, 909...
        1675    POLYGON ((928.926 42.356, 924.110 -55.661, 909...
        1676    POLYGON ((928.910 42.313, 924.095 -55.704, 909...
        Length: 1677, dtype: geometry

Those coordinates are just a wee bit off. Clearly the buffer applied itself as a 1000°, not 1000ft, resulting in a glob of 1677 massive overlapping circles that cover the entire globe. Not quite what I'm looking for. Obviously I'm missing something, any suggestions?

As with any fun code problem, I swear it worked earlier, honest. I futzed around for a while before I finally got it to output the right thing, then I shut it down, went to dinner, came back and re-ran it, and got the above. The obvious deduction is that something I'd done in the aforementioned futzing around had been key to getting it to work, some re-used variable or whatever, but I can't figure out what's missing in the code above.

GeoPandas 0.9.0, pyproj 3.0.1

screenshot from happier times when it worked and I got it onto a map

Rico
  • 3
  • 4

1 Answers1

2

GeoPandas does exactly what is expected to do. You have to re-project your geometries to a target CRS, simply assigning it does not do anything.

When creating the GeoDataFrame, make sure you specify in which CRS your data is. In this case it is EPSG:4326 aka geographical projection in degrees.

ven_coords_gdf = geopandas.GeoDataFrame(ven_coords, 
                                        geometry=geopandas.points_from_xy(ven_coords.VenLon, ven_coords.VenLat),
                                        crs=4326)

Once properly set, you have to reproject (transform) your coordinates to a target CRS using to_crs.

ven_coords_gdf_projected = ven_coords_gdf.to_crs("epsg:2249")

Now you can use the buffer in feet. If you want to store the result in 4326 again, you just reproject it back using to_crs(4326).

I swear it worked earlier, honest.

I am pretty sure it did not :).

martinfleis
  • 7,124
  • 2
  • 22
  • 30
  • That did the trick, thank you! But it really did work previously (hence the map screenshot, which I'd retained as proof to myself that the bloody thing had worked at least once). Combing thru the tea leaves of things I'd tried and then commented out, I think in the aforementioned earlier futzing about I'd set the projection via `pyproj.Proj('epsg:4326')` just prior to creating the GDF. When that didn't work (due to some other error further down the line) I'd incorrectly assumed that part wasn't necessary/useful and moved on. The hazards of futzing about... – Rico May 05 '21 at 22:49