I'm getting a strange error while merging two regions using shapely's unary_union.

Shapely version: 1.6.4.post2

Python 3.5

Data

Polygons (side by side)

I want to add Gujranwala 1 and Gujranwala 2 to make it a single polygon.

Code

from shapely.ops import unary_union
polygons = [dfff['geometry'][1:2], dfff['geometry'][2:3]]
boundary = unary_union(polygons)

Output

    ---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-41-ee1f09532724> in <module>()
      1 from shapely.ops import unary_union
      2 polygons = [dfff['geometry'][1:2], dfff['geometry'][2:3]]
----> 3 boundary = unary_union(polygons)

~/.local/lib/python3.5/site-packages/shapely/ops.py in unary_union(self, geoms)
    145         subs = (c_void_p * L)()
    146         for i, g in enumerate(geoms):
--> 147             subs[i] = g._geom
    148         collection = lgeos.GEOSGeom_createCollection(6, subs, L)
    149         return geom_factory(lgeos.methods['unary_union'](collection))

~/.local/lib/python3.5/site-packages/pandas/core/generic.py in __getattr__(self, name)
   4374             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   4375                 return self[name]
-> 4376             return object.__getattribute__(self, name)
   4377 
   4378     def __setattr__(self, name, value):

AttributeError: 'GeoSeries' object has no attribute '_geom'
Burhan Khalid
Ahsaan-566

1 Answer

Your attempt falls between two approaches that do work. Slicing with dfff["geometry"][1:2] and dfff["geometry"][2:3] returns two GeoSeries (each a sequence of shapely geometries, here of length one), so you're passing unary_union a list of GeoSeries, whereas shapely's unary_union expects a list of shapely geometries. That is why the traceback complains that a 'GeoSeries' object has no attribute '_geom'. To pass the geometries themselves, select them individually. You could do:

polygons = [dfff["geometry"].iloc[1], dfff["geometry"].iloc[2]]
boundary = unary_union(polygons)
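
To see what unary_union does once it receives plain shapely geometries, here is a minimal sketch. The two unit squares are made-up stand-ins for the two Gujranwala polygons (they are not the asker's data); they share an edge, so they should dissolve into a single polygon.

```python
from shapely.geometry import box
from shapely.ops import unary_union

# Hypothetical stand-ins for the two neighbouring regions:
# two unit squares that share the edge x = 1.
left = box(0, 0, 1, 1)
right = box(1, 0, 2, 1)

# unary_union takes an iterable of shapely geometries,
# not a list of GeoSeries.
merged = unary_union([left, right])

print(merged.geom_type)
print(merged.area)
```

If the inputs did not touch, the result would be a MultiPolygon instead of a single Polygon.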

That said, a GeoSeries has its own unary_union property, which calls shapely.ops.unary_union on the geometries it contains. So the easier way to get the unary union would be:

boundary = dfff["geometry"][1:3].unary_union

This also extends much more easily to a longer list of polygons.
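
Positional and label-based selection give different rows when the frame has a non-default index, which is an easy way to end up with a KeyError here. The sketch below uses a plain pandas DataFrame as a stand-in for dfff (an assumption made to keep the example small; the indexing rules come from pandas and apply to a GeoDataFrame as well). The column values, the index labels 10/20/30, and the name df are all hypothetical.

```python
import pandas as pd
from shapely.geometry import box
from shapely.ops import unary_union

# Hypothetical frame with a non-default index (labels 10, 20, 30).
df = pd.DataFrame(
    {
        "name": ["A", "B", "C"],
        "geometry": [box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
    },
    index=[10, 20, 30],
)

# iloc is purely positional, so the column must be given as an integer
# position as well; columns.get_loc translates the column name.
geom_col = df.columns.get_loc("geometry")
polygons = [df.iloc[1, geom_col], df.iloc[2, geom_col]]

# Equivalent: pick the column by name first, then the rows by position.
same = [df["geometry"].iloc[1], df["geometry"].iloc[2]]

boundary = unary_union(polygons)
print(boundary.geom_type, boundary.bounds)
```

Selecting with df["geometry"][1] would instead look for the index label 1, which does not exist in this frame.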

jdmcbr
  • I tried your solution but now I'm getting a key error. – Ahsaan-566 Sep 02 '18 at 08:55
  • This might be a dumb question, but how do I pass unary_union a list of shapely geometries? In this case the union of two polygons is required. – Ahsaan-566 Sep 02 '18 at 09:01
  • My first example was passing a list of shapely geometries. Can you paste the key error you're getting? – jdmcbr Sep 03 '18 at 04:07
  • Ah, in the original version of the post, I didn't notice that you had a non-default index. When using `[a:b]` syntax, rows were selected based on location (i.e., `iloc`). When using `[k]` syntax, selection is based on labels (i.e., `loc`). I'll update my answer above to use the `iloc` syntax. – jdmcbr Sep 05 '18 at 20:05
  • @Ahsaan-566 see https://pandas.pydata.org/pandas-docs/stable/indexing.html for more information on indexing methods in pandas (which are used by geopandas). – jdmcbr Sep 05 '18 at 20:07
  • I tried your solution but it still gives me a Key error. After reading the documentation, I used **df.columns.get_loc("column")** and it worked. `[dfff.iloc[0, dfff.columns.get_loc("geometry")], dfff_rows.iloc[1, dfff.columns.get_loc("geometry")]]` – Ahsaan-566 Sep 13 '18 at 23:26