What I want to do: load a raster from an S3 bucket into memory and set its CRS to EPSG:4326 (it has no CRS set).

What I have so far:

from io import BytesIO

import boto3
import rasterio
from rasterio.crs import CRS

bucket = 'my bucket'
key = 'my_key'
s3 = boto3.client('s3')
file_byte_string = s3.get_object(Bucket=bucket,Key=key)['Body'].read()
with rasterio.open(BytesIO(file_byte_string), mode='r+') as ds:
  crs = CRS({"init": "epsg:4326"}) 
  ds.crs = crs

I found how to structure my code in this question: "Set CRS for a file read with rasterio".

It works if I give it a path to a local file but it does not work for bytestreams.

The error I get when I have 'r+' mode:

rasterio.errors.PathError: invalid path '<_io.BytesIO object at 0x7fb4503ca4d0>'

The error I get when I have 'r' mode:

rasterio.errors.DatasetAttributeError: read-only attribute

Is there a way to load a byte stream in r+ mode so that I can set/modify the CRS?

GStav

1 Answer

You can achieve this if you write your bytes to a NamedTemporaryFile and open that file by name. This and some alternatives are explained in the docs.

import boto3
import rasterio
from rasterio.crs import CRS
import tempfile

bucket = 'asdf'
key = 'asdf'

s3 = boto3.client('s3')
file_byte_string = s3.get_object(Bucket=bucket, Key=key)['Body'].read()

with tempfile.NamedTemporaryFile() as tmpfile:
    tmpfile.write(file_byte_string)
    tmpfile.flush()  # make sure all bytes are on disk before reopening by name
    with rasterio.open(tmpfile.name, "r+") as ds:
        ds.crs = CRS.from_epsg(4326)
        # the temporary file is deleted when the outer with-block exits
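The temporary-file step can also be sketched on its own, using only the standard library. The helper name bytes_as_tempfile is hypothetical (not part of rasterio or boto3). It assumes you want a path that can be reopened by name on any platform, which is why it closes the file before handing out the path and deletes it afterwards.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def bytes_as_tempfile(data, suffix=".tif"):
    """Yield the path of a temporary file holding `data` (hypothetical helper)."""
    # delete=False so the file can be closed and then reopened by name.
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        tmp.write(data)
        tmp.close()  # closing flushes the buffered bytes to disk
        yield tmp.name
    finally:
        tmp.close()  # no-op if the file is already closed
        os.remove(tmp.name)


# Usage sketch (assumes rasterio is installed and file_byte_string holds the raster):
# with bytes_as_tempfile(file_byte_string) as path:
#     with rasterio.open(path, "r+") as ds:
#         ds.crs = CRS.from_epsg(4326)
```

As in the snippet above, the file is gone once the block exits, so anything you need from the updated raster has to be read or copied inside the block.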

An important limitation of this approach is that you have to download the whole file from S3 into memory, as opposed to mounting the file remotely like this.

Gijs