0

I have a shoreline shapefile of five great lakes and a georeferenced lakeshore tiff photo of lake Ontario. I want to overlay the shoreline shapefile into a tiff photo as a new raster layer. Specifically, I want to match pixels of the shapefile and tiff photo with the same coordinates. The result would be a m x n (dimension of one channel of tiff photo) raster layer where values are 1 if the coordinate matches the tiff file and 0 if not match. What should I do after reading those files as below? I can use basic gdal, rasterio, and geopandas in Python. I searched existing answers for a week but couldn't find one.

# read tiff file
ds = gdal.Open(path)
band = ds.GetRasterBand(1)
array = band.ReadAsArray()
# read shapefile
shoreline_delineation = gpd.read_file(shoreline_path)

Below are two images that I mentioned. Thank you very much! Shapefile Geotiff files

Chloe
  • 71
  • 1
  • 7
  • What do you mean by "matching"? You can rasterize the shapefile with `gdal.Rasterize` for example, and parse the output geographical properties from your raster. – Rutger Kassies Sep 13 '21 at 08:54
  • I mean add shoreline shapefile as a new raster layer to tiff photo. values are 1 where the coordinates of the tiff file are the same as shapefile and 0 if not the same. – Chloe Sep 13 '21 at 14:51

1 Answers1

1

Adding a rasterized version of the Shapefile can be done with something like the code below.

Assuming you have two input files, a raster and a vector, and it's assumed your vector is already of the type linestring (not polygons).

from osgeo import gdal

shp_file = 'D:/Temp/input_vector.shp'
ras_file = 'D:/Temp/input_raster.tif'
ras_file_overlay = 'D:/Temp/output_raster_overlay.tif'

Read the properties of the input raster:

ds = gdal.Open(ras_file)
gt = ds.GetGeoTransform()
proj = ds.GetProjection()
n_bands = ds.RasterCount
xsize = ds.RasterXSize
ysize = ds.RasterYSize
ds = None

Calculate the extent of the raster:

ulx, xres, _, uly, _, yres = gt

lrx = ulx + xres * xsize
lry = uly + yres * ysize

Rasterize the vector to an in-memory Geotiff, and read it as a Numpy array:

opts = gdal.RasterizeOptions(
    outputType=gdal.GDT_Byte,
    outputBounds=[ulx, lry, lrx, uly],
    outputSRS=proj,
    xRes=xres,
    yRes=yres,
    allTouched=False,
    initValues=0,
    burnValues=1,
)

tmp_file = '/vsimem/tmp'
ds_overlay = gdal.Rasterize(tmp_file, shp_file, options=opts)
overlay = ds_overlay.ReadAsArray()
ds_overlay = None
gdal.Unlink(tmp_file)

You can change the tmp_file to a location on-disk to write it as an intermediate output (single layer). That can be handy for debugging.

If needed, you can add a buffer to the shoreline to create a bigger/wider border in the output, by adding something like this to the gdal.RasterizeOptions above:

SQLStatement="SELECT ST_Buffer(geometry, 0.1) FROM <layer_name>",
SQLDialect="sqlite",

The Geotiff driver doesn't support the "AddBand" method, so you can't directly add it to your input raster. But adding it to a copy can be done with:

ds = gdal.Open(ras_file)

tmp_ds = gdal.GetDriverByName('MEM').CreateCopy('', ds, 0)
tmp_ds.AddBand()
tmp_ds.GetRasterBand(tmp_ds.RasterCount).WriteArray(overlay)

dst_ds = gdal.GetDriverByName('GTIFF').CreateCopy(ras_file_overlay, tmp_ds, 0)
dst_ds = None
ds = None

Alternatively, you could consider writing the rasterized layer directly to disk as a tiff, and then use something like gdal.BuildVRT to stack them together.

Rutger Kassies
  • 61,630
  • 17
  • 112
  • 97
  • Your code is VERY thorough and explanatory! But when I add ```SQLStatement="SELECT ST_Buffer(geometry, 0.1) FROM ", SQLDialect="sqlite",``` to gdal.RasterizeOptions, I got error ``` AttributeError Traceback (most recent call last) 16 ds_overlay = gdal.Rasterize(tmp_file, shp_file, options=opts) ---> 17 overlay = ds_overlay.ReadAsArray() 18 ds_overlay = None 19 gdal.Unlink(tmp_file) AttributeError: 'NoneType' object has no attribute 'ReadAsArray' ``` The code runs well without adding the SQL statement. – Chloe Sep 17 '21 at 01:07
  • @Chloe, did you change the `` part to your actual layer name? Usually the layer name of a Shapefile is equal to the filename without the extension (and path). The error your getting is downstream, GDAL probably printed an error in the command line/terminal you're using, hinting at this layer name issue. – Rutger Kassies Sep 17 '21 at 07:25
  • In my example above, the layer name would be `input_vector`. You could consider doing it automatically by using Python string formatting, combined with `os.path.basename(shp_file)`. – Rutger Kassies Sep 17 '21 at 07:29
  • Filename without the extension works! I used the name of the path to shapefile so it didn't work. – Chloe Sep 20 '21 at 20:20
  • Glad it worked! I now realize my last comment should have been: `os.path.splitext(os.path.basename(shp_file))[0]`. In order to also remove the extension from the filename. – Rutger Kassies Sep 21 '21 at 07:59