I'm trying to use intake and intake-xarray to open and cache remote files. I have a minimal catalog file here:

/isibhv/projects/paleo_pool/boundary_conditions/ice_sheet_reconstructions/ice_sheet_reconstructions.yaml

It looks like this:

metadata:
  version: 1
sources:
  glac1d:
    description: The GLAC-1D Reconstruction 
    driver: netcdf
    args:
        urlpath: "https://sharebox.lsce.ipsl.fr/index.php/s/yfuUw91ruuJXroC/download?path=%2F&files=TOPicemsk.GLACD26kN9894GE90227A6005GGrBgic.nc"
    cache_dir: "{{ CATALOG_DIR }}/glac1d"
    cache: 
        - argkey: urlpath
          type: file

I can open the files in Python:

import intake
cat = intake.open_catalog("ice_sheet_reconstructions.yaml")
ds = cat.glac1d.read()

This all works wonderfully, and I get the file as I would expect. However, the cache doesn't show up where I expected it. I would have guessed that a new folder would be created under:

/isibhv/projects/paleo_pool/boundary_conditions/ice_sheet_reconstructions/glac1d

Instead, I get something in my home directory.

Did I specify the cache directory incorrectly?

As a second question: is it possible to directly specify how the cached files should be named when they are saved?

Thanks! Paul

pgierz

1 Answer

The location of the cache is set by the intake config, a YAML file that typically lives at ~/.intake/conf.yaml (key "cache_dir"). A different config file can be selected with the INTAKE_CONF(_FILE) environment variable. The location may also be settable through the metadata of the source, key "catalog_dir", although I am not certain that is correct. The special value "catdir" means "in the directory where the catalog is".
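As a rough sketch of the config route, the snippet below writes a minimal config file containing only the "cache_dir" key and points intake at it through an environment variable. The helper name `write_intake_conf` and the temporary location are made up for illustration, and the exact variable name (`INTAKE_CONF_FILE` here) is an assumption that may differ between intake versions.

```python
import os
import tempfile
from pathlib import Path


def write_intake_conf(conf_path, cache_dir="catdir"):
    """Write a minimal config file holding only the cache_dir key.

    Hypothetical helper for illustration; it is not part of intake.
    """
    conf_path = Path(conf_path)
    conf_path.parent.mkdir(parents=True, exist_ok=True)
    # "catdir" asks for the cache to live next to the catalog file.
    conf_path.write_text(f'cache_dir: "{cache_dir}"\n')
    return conf_path


# Hypothetical location; any writable path would do.
conf_file = write_intake_conf(Path(tempfile.mkdtemp()) / "conf.yaml")

# Assumption: intake reads this variable when it is first imported,
# so it should be set before "import intake".
os.environ["INTAKE_CONF_FILE"] = str(conf_file)
```

Editing ~/.intake/conf.yaml by hand should have the same effect; the environment variable is only useful if you do not want to change the per-user default.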

However, now that caching is available in fsspec, the following will be possible:

sources:
  glac1d:
    description: The GLAC-1D Reconstruction 
    driver: netcdf
    args:
        urlpath: "filecache://sharebox.lsce.ipsl.fr/index.php/s/yfuUw91ruuJXroC/download?path=%2F&files=TOPicemsk.GLACD26kN9894GE90227A6005GGrBgic.nc"
        storage_options:
            target_protocol: https
            cache_storage: "{{ CATALOG_DIR }}/glac1d"

Unfortunately, the required change is not yet in intake-xarray.
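To see what the "filecache" protocol does independently of intake, here is a minimal sketch that calls fsspec directly with the same two options as the catalog entry (`target_protocol` and `cache_storage`). To keep it runnable offline it uses a local stand-in file and `target_protocol="file"`; for the real data set one would presumably pass `target_protocol="https"` and the sharebox URL. The file name and contents are placeholders.

```python
import tempfile
from pathlib import Path

import fsspec

# Stand-in for the remote file, so the sketch runs without network access.
src = Path(tempfile.mkdtemp()) / "example.nc"
src.write_bytes(b"placeholder bytes, not a real NetCDF file")

# Directory that plays the role of "{{ CATALOG_DIR }}/glac1d".
cache_dir = Path(tempfile.mkdtemp()) / "glac1d"

# Whole-file caching filesystem wrapped around the target protocol.
fs = fsspec.filesystem(
    "filecache",
    target_protocol="file",
    cache_storage=str(cache_dir),
)

# The first open copies the file into cache_dir and then reads the local copy.
with fs.open(str(src), "rb") as f:
    data = f.read()

cached_entries = sorted(p.name for p in cache_dir.iterdir())
```

After the first read, `cache_dir` holds the cached copy together with fsspec's own bookkeeping, and later opens of the same path are served from there.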

mdurant
  • Do you know who I should get in contact with to implement this in the Xarray extension? I would find this *extremely* useful... – pgierz Mar 23 '20 at 07:10
  • That would be me https://github.com/intake/intake-xarray/pull/62 – mdurant Mar 23 '20 at 12:27