Geodjango: how to load .shp file and convert to geojson with the right CRS?

Question

I have multiple shapefiles (.shp) with their auxiliary files that I want to display on a Leaflet map. The shapefiles use different coordinate reference systems (CRS) and I struggle to grasp the most straightforward and reliable way to show things on the map. In the geodjango tutorial, DataSource is used to load a shapefile and then manipulate it. However, in their examples, they only retrieve the geometry of individual features, not of the entire shapefile. I have used PyShp and I am able to show a map using something like:

    sf = shapefile.Reader(filename)
    shapes = sf.shapes()
    geojson = shapes.__geo_interface__
    geojson = json.dumps(geojson)

However, this fails when the CRS is not WGS84 things don't work, and I don't see how to convert it.

Reading a bit more, this post complains about CRS support and pyshp, and suggests using ogr2ogr.

So after trying to understand the options, I see using Datasource, pyshp, and ogr2ogr as possible options, but I don't know which option really makes the most sense.

All I want is convert a .shp file using Django into a geojson string that uses WGS84 so that I can include it on an HTML page that uses Leaflet.

Can anyone with more experience suggest a particular route?

John Moutafis · Answer 1 · 2020-06-28T08:27:38.890

There is not a straight forward way to read any shapefile using Django's DataSource and then translate it to EPSG:4326 (aka WGS84), therefore we need to create one step by step and solve the issues that arise as we get to them.

Let's begin the process:

Create a list of all the .shp file paths that you need to read in. That should look like this:

SHP_FILE_PATHS = [
    'full/path/to/shapefile_0.shp',
    'full/path/to/shapefile_1.shp',
    ...
    'full/path/to/shapefile_n.shp'
]

DataSource reads the shapefile into an object. The information is stored in the object's Layers (representing a multilayered shapefile) that are aware of their srs as a SpatialReference. That is important because we will transform the geometry later to WGS84 in order to be displayable on the map.
From each Layer of each shapefile, we will utilize the get_geoms() method to extract a list of OGRGeometry srs aware objects.
Each such geometry has a json method that:
Returns a string representation of this geometry in JSON format:
```
>>> OGRGeometry('POINT(1 2)').json 
'{ "type": "Point", "coordinates": [ 1.000000, 2.000000 ] }'
```
That is very useful because is half the solution to create a FeatureCollection type geojson that will be displayable to the map.

A FeatureCollection geojson has a very specific format and therefore we will create the basis and fill it procedurally:

feature_collection = {
    'type': 'FeatureCollection',
    'crs': {
        'type': 'name',
        'properties': {'name': 'EPSG:4326'}
    },
    'features': []
}

Finally, we need to populate the features list with the extracted geometries in the following format:

{
    'type': 'Feature',
    'geometry': {
        'type': Geometry_String,
        'coordinates': coord_list
     },
     'properties': {
         'name': feature_name_string
     }
}

Let's put all the above together:

for shp_i, shp_path in enumerate(SHP_FILE_PATHS):
    ds = DataSource(shp_path)
    for n in range(ds.layer_count):
        layer = ds[n]
        # Transform the coordinates to epsg:4326
        features = map(lambda geom: geom.transform(4326, clone=True), layer.get_geoms())
        for feature_i, feature in enumerate(features):
            feature_collection['features'].append(
                {
                    'type': 'Feature',
                    'geometry': json.loads(feature.json),
                    'properties': {
                        'name': f'shapefile_{shp_i}_feature_{feature_i}'
                    }
                }
            )

Now the feature_collection dict will contain the extracted feature collection transformed to epsg:4326 and you can create a json form it (ex. json.dump(feature_collection))

NOTE: Although this will work, it seems a bit counterproductive and you may consider reading the shapefiles into a model permanently instead of loading them on the fly.

Thanks! I've been trying to roll this out but the one issue is that my final json seems to be invalid. I think the problem is that inside each feature there is `'geometry': feature.json` -> this will return a json format, and then at the end I run json.dump on this same object so I end up with something like: `{ \"type\": \"LineString\", \"coordinates\": [ [ 22.509143, -11.06238335 ] .....` — , Jun 27 '20 at 10:28
@User402841 Yeah I forgot to add the `json.loads` call, I have updated the answer. — John Moutafis, Jun 28 '20 at 08:28
thanks, this works now! I also see you mention this seems counter productive and you suggest reading the shapefiles into a model permanently... let me explain: this is for a system where users can upload shapefiles, and _before_ they get stored in the db (which in our case is quite a complex operation) we need to make sure the user loaded the right file. To do so, I want to show a map on-screen (using leaflet), and I can not think of any other way to show this map than reading the shapefile and necessarily converting the CRS. If you know of another way, please do let me know! — , Jun 28 '20 at 08:37
@User402841 Sounds very interesting. As I think of it now, I would use a combination of reading the shapefile into a geopandas dataframe translate the dataframe's crs if needed and show the shapefile plot through an endpoint, into a Django template. I am not adding this to the answer as it would be out of scope. You can try and if you get stuck somewere you can make another question here on SO :) — John Moutafis, Jun 28 '20 at 13:19

Geodjango: how to load .shp file and convert to geojson with the right CRS?

1 Answers1