14

I'm trying to combine multiple shapefiles by implementing the following:

import geopandas as gpd
import pandas as pd

for i in range(10,56):
    interesting_files = "/Users/m3105/Downloads/area/tl_2015_{}_arealm.shp".format(i)
    gdf_list = []
    for filename in sorted(interesting_files):
        gdf_list.append(gpd.read_file((filename)))
        full_gdf = pd.concat(gdf_list)

Here the directory /Users/m3105/Downloads/area contains several shapefiles, such as tl_2015_01_arealm.shp and tl_2015_02_arealm.shp, all the way up to tl_2015_56_arealm.shp. I'd like to combine all of these shapefiles without repeating their headers. However, whenever I try concatenating the files using the code above, I get the following error:

ValueError: Null layer: u''

I know how to concatenate CSV files, but I'm not sure how to concatenate shapefiles. I'd greatly appreciate any help.

Paul H
M3105
    Several errors prevent your code from running: `interesting_files` is a single string, so `for filename in sorted(interesting_files):` loops over the individual characters of that filename. Further, the `pd.concat(gdf_list)` call should be outside the for loop. – joris Feb 19 '18 at 22:15
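The two problems named in the comment above can be sketched as follows. This is a minimal illustration, not the asker's final code: the helper names `shapefile_paths` and `combine` are hypothetical, the zero-padded codes 01 through 56 are inferred from the filenames listed in the question, and skipping codes that have no file on disk is an assumption.

```python
from pathlib import Path


def shapefile_paths(folder, codes):
    """Build one full path per code, so each list element is a whole filename."""
    return [Path(folder) / f"tl_2015_{code:02d}_arealm.shp" for code in codes]


def combine(paths):
    """Read every existing shapefile, then concatenate once after the loop."""
    # Imported lazily so the path helper above works without geopandas installed.
    import geopandas as gpd
    import pandas as pd

    # Assumption: codes with no matching file on disk are simply skipped.
    frames = [gpd.read_file(p) for p in paths if Path(p).exists()]
    # pd.concat runs once, on the complete list, not inside the loop.
    return pd.concat(frames, ignore_index=True)


# Iterating over a single string yields its characters, not filenames.
chars = sorted("a.shp")

# Hypothetical usage with the folder from the question.
paths = shapefile_paths("/Users/m3105/Downloads/area", range(1, 57))
```

Calling `combine(paths)` would then return a single frame holding the rows of every file that was found.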

3 Answers

21

I can't test this since I don't have your data, but you want something like this (assuming Python 3):

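from pathlib import Path
import pandas
import geopandas

folder = Path("/Users/m3105/Downloads/area")
# Match every file that follows the tl_2015_<code>_arealm.shp pattern
shapefiles = folder.glob("tl_2015_*_arealm.shp")
# Read each shapefile, concatenate once, and make sure the result is a GeoDataFrame
gdf = pandas.concat([
    geopandas.read_file(shp)
    for shp in shapefiles
]).pipe(geopandas.GeoDataFrame)
# Write the combined layer back to disk
gdf.to_file(folder / 'compiled.shp')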
Paul H
18

With pandas.concat as in @Paul H's answer, some geographic information, such as the coordinate reference system (CRS), is not preserved by default. It worked for me when I passed the CRS explicitly, as below:

import os
import geopandas as gpd
import pandas as pd

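file = os.listdir("Your folder")
# Keep only names ending in .shp, so sidecar files such as .shp.xml are skipped
path = [os.path.join("Your folder", i) for i in file if i.endswith(".shp")]

# Concatenate all layers and set the CRS of the first file explicitly
gdf = gpd.GeoDataFrame(pd.concat([gpd.read_file(i) for i in path], 
                        ignore_index=True), crs=gpd.read_file(path[0]).crs)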

This way, the resulting GeoDataFrame has the CRS you need.

DrSnuggles
lemon
1

I don't have enough rep to comment on the last answer, but after testing input files with differing CRSs,

gdf = gpd.GeoDataFrame(pd.concat([gpd.read_file(i) for i in path], 
                        ignore_index=True), crs=gpd.read_file(path[0]).crs)

should be

gdf = gpd.GeoDataFrame(pd.concat([gpd.read_file(i).to_crs(gpd.read_file(path[0]).crs) for i in path], 
                        ignore_index=True), crs=gpd.read_file(path[0]).crs)
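The one-liner above re-reads the first file once per input just to look up its CRS. A possible restructuring, offered as a sketch rather than a tested replacement, reads each file once and reprojects only the frames whose CRS differs from the first. The helper names `align_crs` and `combine` are hypothetical; the code relies only on the `.crs` attribute and the `.to_crs()` method that GeoDataFrames provide.

```python
def align_crs(frames):
    """Reproject every frame to the CRS of the first frame in the sequence."""
    frames = list(frames)
    if not frames:
        return []
    target = frames[0].crs  # looked up once, not once per file
    # Frames already in the target CRS are passed through unchanged.
    # Note: geopandas raises an error from to_crs() if a frame has no CRS set.
    return [f if f.crs == target else f.to_crs(target) for f in frames]


def combine(paths):
    """Read each shapefile once, align the CRSs, then concatenate."""
    # Imported lazily so align_crs can be used without geopandas installed.
    import geopandas as gpd
    import pandas as pd

    frames = align_crs(gpd.read_file(p) for p in paths)
    return gpd.GeoDataFrame(
        pd.concat(frames, ignore_index=True), crs=frames[0].crs
    )
```

With the `path` list from the previous answer, `gdf = combine(path)` would be the equivalent call.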