
This question is similar to another one, but none of the solutions posted there worked for me. I have included several of those attempts and their results below. If another library can achieve this, I am open to it.

I am trying to use GeoPandas to expand a GeoJSON file that contains multiple MultiPolygons, so that each polygon gets its own row.

Current geodataframe (3 Rows)

fill    fill-opacity    stroke  stroke-opacity  stroke-width    title   geometry
0   #9bf1e2 0.3 #9bf1e2 1   1   Hail Possible   (POLYGON ((-80.69500140880155 22.2885709067316...
1   #08c1e6 0.3 #08c1e6 1   1   Severe Hail (POLYGON ((-103.4850007575523 29.2010260633722...
2   #682aba 0.3 #682aba 1   1   Damaging Hail   (POLYGON ((-104.2750007349772 32.2629245180204...

Desired geodataframe (200+ Rows)

fill    fill-opacity    stroke  stroke-opacity  stroke-width    title   geometry
0   #9bf1e2 0.3 #9bf1e2 1   1   Hail Possible   (POLYGON ((-80.69500140880155 22.2885709067316...
1   #9bf1e2 0.3 #9bf1e2 1   1   Hail Possible   (POLYGON ((-102.8150007766983 28.2180513479277...
2   #9bf1e2 0.3 #9bf1e2 1   1   Hail Possible   (POLYGON ((-103.4850007575523 29.0940821135748...
3   #9bf1e2 0.3 #9bf1e2 1   1   Hail Possible   (POLYGON ((-103.5650007552662 30.9947420843694...
4   #9bf1e2 0.3 #9bf1e2 1   1   Hail Possible   (POLYGON ((-103.6150007538374 31.0173836504729...

Sample GeoJSON file being used: https://drive.google.com/file/d/1m6cMR4jF3QWp07e23sIdb0UF9xLD062s/view?usp=sharing

What I've tried, with no success:

df3.set_index(['title'])['geometry'].apply(pd.Series).stack().reset_index()

(Returns original unchanged gdf)

def cartesian(x): 
    return np.vstack(np.array([np.array(np.meshgrid(*i)).T.reshape(-1,7) for i in x.values]))
ndf = pd.DataFrame(cartesian(df3),columns=df3.columns)

(Returns original unchanged gdf)

import geopandas as gpd
import pandas as pd
from shapely.geometry import MultiPolygon, Polygon

def explode(indata):
    # Read the file and collect one output row per single-part polygon.
    indf = gpd.read_file(indata)
    rows = []
    for idx, row in indf.iterrows():
        if isinstance(row.geometry, Polygon):
            rows.append(row)
        elif isinstance(row.geometry, MultiPolygon):
            # .geoms iterates over the member polygons (needed in Shapely 2.x,
            # where len() and indexing on a MultiPolygon were removed).
            for part in row.geometry.geoms:
                new_row = row.copy()
                new_row['geometry'] = part
                rows.append(new_row)
    # DataFrame.append was removed in pandas 2.0, so build the frame in one step.
    outdf = gpd.GeoDataFrame(pd.DataFrame(rows, columns=indf.columns),
                             geometry='geometry', crs=indf.crs)
    return outdf.reset_index(drop=True)

explode(GEOJSONFILE)

(Returns original unchanged gdf)

This is my first question on here so if any additional info or details are needed please let me know.

UPDATE: The issue with the explode() function turned out to be a formatting problem in the file: the geometry was essentially a multi-polygon of multi-polygons, so the loop only ever processed the first multi-polygon. The explode function itself works.
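For anyone hitting the same symptom, one way to check whether it is a data issue is to count how many polygons each feature actually holds in the raw GeoJSON before involving GeoPandas. The following is only a minimal sketch using the standard library; the `count_parts` helper and the `SAMPLE` coordinates are made up for illustration and are not taken from the linked file.

```python
def count_parts(collection):
    """Return (title, geometry type, number of polygon parts) for each feature."""
    rows = []
    for feature in collection["features"]:
        geom = feature["geometry"]
        if geom["type"] == "MultiPolygon":
            # In GeoJSON, MultiPolygon coordinates hold one entry per member polygon.
            parts = len(geom["coordinates"])
        elif geom["type"] == "Polygon":
            parts = 1
        else:
            parts = None  # other geometry types are not counted in this sketch
        rows.append((feature["properties"].get("title"), geom["type"], parts))
    return rows


# Hypothetical sample data (coordinates invented for the example).
SAMPLE = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"title": "Hail Possible"},
         "geometry": {"type": "MultiPolygon", "coordinates": [
             [[[0, 0], [1, 0], [1, 1], [0, 0]]],
             [[[2, 2], [3, 2], [3, 3], [2, 2]]],
         ]}},
        {"type": "Feature", "properties": {"title": "Severe Hail"},
         "geometry": {"type": "Polygon",
                      "coordinates": [[[5, 5], [6, 5], [6, 6], [5, 5]]]}},
    ],
}

for title, geom_type, parts in count_parts(SAMPLE):
    print(title, geom_type, parts)
```

For a real file, the dictionary would come from `json.load()` on the opened GeoJSON. If every feature reports a single part, exploding cannot produce more rows than the input has.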

Adam Barrett

1 Answer


You can use the GeoPandas GeoDataFrame.explode() method.

exploded = original_df.explode()

Copying from the docstring:

    Explode multi-part geometries into multiple single geometries.

    Each row containing a multi-part geometry will be split into
    multiple rows with single geometries, thereby increasing the vertical
    size of the GeoDataFrame.
    The index of the input geodataframe is no longer unique and is
    replaced with a multi-index (original index with additional level
    indicating the multiple geometries: a new zero-based index for each
    single part geometry per multi-part geometry).

    Returns
    -------
    GeoDataFrame
        Exploded geodataframe with each single geometry
        as a separate entry in the geodataframe.
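
To make the row-splitting described in the docstring concrete, here is a rough pure-Python sketch of the same idea applied to raw GeoJSON features. It is not how GeoPandas implements explode(); the `explode_features` helper and the `SAMPLE` coordinates are assumptions made up for this illustration.

```python
import copy


def explode_features(collection):
    """Split each MultiPolygon feature into one Polygon feature per part."""
    exploded = []
    for feature in collection["features"]:
        geom = feature["geometry"]
        if geom["type"] != "MultiPolygon":
            # Single-part geometries pass through unchanged.
            exploded.append(copy.deepcopy(feature))
            continue
        for polygon_coords in geom["coordinates"]:
            # Each part becomes its own feature and keeps the original attributes.
            exploded.append({
                "type": "Feature",
                "properties": copy.deepcopy(feature["properties"]),
                "geometry": {"type": "Polygon",
                             "coordinates": copy.deepcopy(polygon_coords)},
            })
    return {"type": "FeatureCollection", "features": exploded}


# Hypothetical sample data (coordinates invented for the example).
SAMPLE = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"title": "Hail Possible"},
         "geometry": {"type": "MultiPolygon", "coordinates": [
             [[[0, 0], [1, 0], [1, 1], [0, 0]]],
             [[[2, 2], [3, 2], [3, 3], [2, 2]]],
         ]}},
        {"type": "Feature", "properties": {"title": "Severe Hail"},
         "geometry": {"type": "Polygon",
                      "coordinates": [[[5, 5], [6, 5], [6, 6], [5, 5]]]}},
    ],
}

result = explode_features(SAMPLE)
print(len(SAMPLE["features"]), "features in,", len(result["features"]), "features out")
```

The number of output rows equals the total number of single-part geometries, so a frame whose MultiPolygons each contain one polygon keeps the same row count after exploding.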
martinfleis
  • Thanks for the quick response. I tried this but it didn't work. I'm starting to wonder if this could be a data issue more than coding issue. – Adam Barrett Sep 30 '19 at 20:16
  • @AdamBarrett can you be more specific than "it didn't work"? You didn't get the expected result, and how did it differ? – joris Oct 01 '19 at 08:37
  • @joris The code executes but doesn't expand the nested polygons in each row. The original shape of the geodataframe is (3,7), and after running 'exploded = df3.explode()' followed by 'exploded.shape', the shape of exploded is still (3,7). – Adam Barrett Oct 02 '19 at 14:23
  • @AdamBarrett can you explain again what you are trying to do? In your data, all three MultiPolygons have only one component, so when exploded you get three polygons (correctly). There is nothing in the data that would allow it to be exploded to 200 rows. – martinfleis Oct 02 '19 at 15:57
  • Is there an opposite method to `explode`? Something that would convert several polygon geometries into a single multipolygon geometry? – Basile Apr 17 '20 at 12:28
  • @Basile Try looking at the dissolve function. I've used this in the past to combine several polygons by a common attribute. https://geopandas.org/aggregation_with_dissolve.html – Adam Barrett May 22 '20 at 16:21