
I am opening a NetCDF file stored on S3 with the following code, but it raises an exception.

import s3fs
import xarray as xr

filepath = "s3://mybucket/myfile.nc"
fs = s3fs.S3FileSystem()

with fs.open(filepath) as infile:
    print("opening")
    ds = xr.open_dataset(infile, engine="h5netcdf")
    print(ds)
    print("done")

with fs.open(filepath) as infile:
    print("opening")
    ds = xr.open_dataset(infile, engine="h5netcdf")
    print(ds)
    print("done")

On the second ds = xr.open_dataset(infile, engine="h5netcdf") I get an exception: "I/O operation on closed file."

Why?

I found that putting a ds.close() between the two sections fixes it. Does that imply that, even though infile was closed when the with block ended, ds still had it locked for exclusive use?

Additionally, between the two blocks I tried print(ds["variable_name"].values) and also got the "operation on closed file" exception. That isn't surprising, since the file is closed and the data was lazily loaded, but it again raises the question of why the second call to open_dataset fails.
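The lazy-loading part of this can be reproduced without S3 or xarray. The sketch below is only an illustration of the mechanism: LazyReader is a hypothetical stand-in for a lazily loaded dataset, and io.BytesIO stands in for the file returned by fs.open. It shows why reading values after the with block fails; it does not by itself explain the failure on the second open_dataset call.

```python
import io


class LazyReader:
    """Hypothetical stand-in for a lazily loaded dataset.

    It keeps a reference to the file handle and only reads when asked.
    """

    def __init__(self, handle):
        self._handle = handle

    def values(self):
        # The read happens here, not when the reader was created.
        self._handle.seek(0)
        return self._handle.read()


with io.BytesIO(b"payload") as infile:
    reader = LazyReader(infile)
    inside = reader.values()  # works: the handle is still open

try:
    reader.values()  # the with block has already closed the handle
    outside = None
except ValueError as err:
    outside = str(err)

print(inside)
print(outside)
```

The reader object outlives the with block, but the handle it depends on does not, so any deferred read has to happen while the handle is still open.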

Andrew Gaul
Scott

1 Answer


The netCDF library places a lock on file objects opened for reading, and this can include fs objects. Opening the netCDF datasets in a context manager or, as you point out, explicitly closing them will resolve the issue:

import s3fs
import xarray as xr

filepath = "s3://mybucket/myfile.nc"
fs = s3fs.S3FileSystem()

with fs.open(filepath) as infile:
    print("opening")
    with xr.open_dataset(infile, engine="h5netcdf") as ds:
        print(ds)
        print("done")

with fs.open(filepath) as infile:
    print("opening")
    with xr.open_dataset(infile, engine="h5netcdf") as ds:
        print(ds)
        print("done")
Michael Delgado
  • ok, so there is a file object and an xarray object, and *both* must be open in order to load data and *both* should be closed afterwards, correct? I just rarely see this happening in examples/tutorials, they typically just get opened and forgotten. – Scott Jun 21 '22 at 23:03
  • Totally. You can also *not* use context managers for both. But if you're going to open the files multiple times, you should make sure they're closed, and in my experience using a context manager for fs but not for xarray can lead to the types of errors you're seeing. I'm not sure if this is just a bad interaction between the modules (e.g. netCDF improperly hanging on to dummy file handlers or something). – Michael Delgado Jun 21 '22 at 23:10
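If the data is small enough to fit in memory, one way to follow the "both open, both closed" rule from the comments is to read everything while both handles are open and only then let them close. This is a minimal sketch, not part of the original answer: load_dataset is a hypothetical helper name, and fs is assumed to be any object with an open(path) method that returns a binary file object, such as s3fs.S3FileSystem().

```python
import xarray as xr


def load_dataset(fs, path, engine="h5netcdf"):
    """Hypothetical helper: read a dataset fully into memory, then close both handles."""
    with fs.open(path) as infile:
        with xr.open_dataset(infile, engine=engine) as ds:
            # load() pulls the lazily referenced data into memory, so the
            # returned dataset no longer needs the file once both with
            # blocks have exited.
            return ds.load()
```

With the question's setup this would be called as ds = load_dataset(s3fs.S3FileSystem(), "s3://mybucket/myfile.nc"), after which ds["variable_name"].values should be readable even though the file is closed. For data too large to load at once, the nested context managers from the answer remain the safer pattern.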