I am storing a .nc file on Amazon S3 and want to open it with rasterio.open(). According to the documentation, rasterio supports reading from S3: https://rasterio.readthedocs.io/en/latest/topics/datasets.html

However, I would like to combine two of the features listed on that page, something like:

import rasterio

my_path = "netcdf:/s3://*/*/file.nc:variable"
rasterio.open(my_path)

When I copy the file to my local environment and add netcdf:/ as a prefix and :variable as a suffix, it works. However, I can't get the same thing to work from S3. I receive this error:

RasterioIOError: Failed to parse NETCDF: prefix string into expected 2, 3 or 4 fields.

Thank you for your help!

Hugoz13
  • `netcdf:` and `s3:` are two different prefixes (for accessing two different kinds of storage), but it can use only one of them, so you probably need only `s3://*/*/file.nc:variable`. Or maybe it can use `netcdf+s3:`, similar to `zip+http:`. I can't test it. – furas Sep 01 '22 at 13:32
  • Unfortunately, netcdf+s3: does not work. S3 alone works but does not let me access the variable I want. – Hugoz13 Sep 01 '22 at 13:57
  • It seems you will have to write code that first downloads the file to a local folder. – furas Sep 01 '22 at 14:05
  • Are you able to open any netcdf files this way? The underlying netcdf library doesn’t support streaming reads. Also, it may be easier to load the data with xarray and then move to rasterio using the .rio accessor – Michael Delgado Sep 01 '22 at 15:15

1 Answer

Try working around Rasterio's HREF parsing by passing a path that uses GDAL's /vsis3/ virtual filesystem directly:

"netcdf:/vsis3/<bucket_name>/<prefix>/<file_name>:<subdataset_name>"

I have not confirmed how much of the file Rasterio actually pulls into memory, but it does load the subdataset as the sole constituent of the 'rasterio.io.DatasetReader' object.
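
As a minimal sketch of the workaround above, the helper below turns an s3:// URL and a variable name into that "netcdf:/vsis3/..." string. The function names netcdf_vsis3_path and open_variable are hypothetical, introduced here only for illustration. It also assumes that the installed GDAL build includes the NetCDF driver and that AWS credentials are available to GDAL (for example through the usual environment variables); neither has been verified here.

```python
from urllib.parse import urlparse


def netcdf_vsis3_path(s3_url: str, variable: str) -> str:
    """Build a GDAL NetCDF subdataset name that reads from S3 via /vsis3/.

    Hypothetical helper: it only does string manipulation and never
    contacts S3.
    """
    parsed = urlparse(s3_url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected an s3://bucket/key URL, got {s3_url!r}")
    key = parsed.path.lstrip("/")  # object key without the leading slash
    return f"netcdf:/vsis3/{parsed.netloc}/{key}:{variable}"


def open_variable(s3_url: str, variable: str):
    """Open one NetCDF variable from S3 (assumes rasterio is installed)."""
    import rasterio  # imported lazily so the path helper works without it

    return rasterio.open(netcdf_vsis3_path(s3_url, variable))
```

For example, netcdf_vsis3_path("s3://my-bucket/some/prefix/file.nc", "variable") returns "netcdf:/vsis3/my-bucket/some/prefix/file.nc:variable", where the bucket and key are placeholders. If remote access turns out not to work with a particular build, downloading the file first, as suggested in the comments, remains the fallback.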

d48sp23