4

I've found a shapefile for 4 digit postal codes in the Netherlands here:

https://public.opendatasoft.com/explore/dataset/openpostcodevlakkenpc4/export/?location=9,52.16172,5.56595&basemap=jawg.streets

What I'm trying to do is to combine postcodes that share the first two digits, and then plot them on a map.

Unfortunately there seems to be a problem with the data - there are some areas in the Netherlands that don't seem to be covered. As a result, when I combine the polygons to get the 2-digit polygons (from the 4-digit ones), there are holes in the resulting multipolygons.

I need to fill in these holes. I've seen posts for similar issues but nothing seems to do exactly what I need. In particular, I saw a post about using concave hulls, but this seems like overkill here.

There are several artifacts I've managed to fix (e.g. "slivers") but the holes remain.

Here's what I have so far:

from shapely.ops import cascaded_union
from shapely.geometry import JOIN_STYLE, Polygon, MultiPolygon

import os
import folium
import pandas as pd
import geopandas as gpd

GEOM_LOC = r"PATH_TO_FILE_ABOVE\openpostcodevlakkenpc4.shx"

# Get the data
geom = gpd.read_file(GEOM_LOC)

# Remove empty or nan records
is_empty = geom.geometry.is_empty
is_nan = geom.geometry.isna()
geom = geom[~(is_empty | is_nan)]

# Add field for 2 digit postcode
geom["digit"] = geom.pc4.apply(lambda x: x[0:2])
geom = geom[["digit", "geometry"]]

# Make dataframe for 2-digit zones
tags = list(set(geom["digit"]))
df = pd.DataFrame(tags, columns=["tag"])

# Function returning union of geometries
def combine_borders(*geoms):
    return cascaded_union([
        geom if geom.is_valid else geom.buffer(0) for geom in geoms
    ])

# Wrapping function above for application to dataframe
def combine(tag):
    sub_geom = geom[geom["digit"] == tag]
    bounds = list(sub_geom.geometry)
    return(combine_borders(*bounds))

# Apply the function
df["boundary"] = df.tag.apply(lambda x: combine(x))

# The polygons generated above do not fit perfectly together,
# resulting in artifacts. We fix that now
eps = 0.00001

def my_buffer(area):
    return(
        area.buffer(eps, 1, join_style=JOIN_STYLE.mitre).buffer(-eps, 1, join_style=JOIN_STYLE.mitre)
    )

df["boundary"] = df.boundary.apply(lambda x: my_buffer(x))

# Have a look at one of the holes
test = geom[geom.digit == "53"]

test = df.loc[df.tag == "53", "boundary"].item()
m = folium.Map(location=[51.8, 5.4], zoom_start=10, tiles="CartoDB positron")
m.choropleth(test)
m

As you see, I've tested on the two-digit zone "53". There is a hole that I need to fill:

enter image description here

I realise that the underlying geometry is quite complicated, but is there a straightforward way to fill this "hole"?

Many thanks for your help.

EDIT - 2020-04-25-16.14

For reference, here is the 2 digit postcode area "53" from wikipedia:

enter image description here

So it's not that there's a postcode-within-a-postcode - a postcode enclave.

EDIT - 2020-04-27

I found this post:

Convert Multipolygon to Polygon in Python

Applying the code

no_holes = MultiPolygon(Polygon(p.exterior) for p in m)

to my multipolygon closes the hole:

enter image description here

Frank
  • 223
  • 3
  • 11

1 Answers1

6

This closes simple holes in Polygon geometries of a GeoDataFrame.

def close_holes(poly: Polygon) -> Polygon:
        """
        Close polygon holes by limitation to the exterior ring.
        Args:
            poly: Input shapely Polygon
        Example:
            df.geometry.apply(lambda p: close_holes(p))
        """
        if poly.interiors:
            return Polygon(list(poly.exterior.coords))
        else:
            return poly

df = df.geometry.apply(lambda p: close_holes(p))
Christoph Rieke
  • 707
  • 5
  • 7
  • Thanks for the response Christoph. The issue here is that the shapes with holes are multipolygons and as such have no attribute "interiors". After further research looks like my question has been answered already: https://stackoverflow.com/questions/48082553/convert-multipolygon-to-polygon-in-python – Frank Apr 28 '20 at 08:33