I was wondering if there is a way to stream data directly from a NetCDF file as it's being written, using `xarray`.

I think I can "create" a non-buffered file like this?

import io
# buffering=0 disables buffering (binary mode only); -1 selects the default buffering policy
ts_file_stream = io.open("/some/file/being/written/to.nc", mode='rb', buffering=0)

I also know that I can open this with Xarray:

import xarray as xr
ds = xr.open_dataset(ts_file_stream)

However, I am unsure whether the arrays will then be updated continually. The purpose of this whole thing is as follows: I have a numerical model producing output, and I'd like to visualize some variables as the model runs to get a feeling for its current state. I know streaming plots are supported by the HoloViews people: https://hvplot.holoviz.org/user_guide/Streaming.html

Would this require me to make my own stream with the streamz library? https://streamz.readthedocs.io/en/latest/index.html

Any hints on how to get that to work for NetCDF would be wonderful!

Cheers,
Paul

pgierz

1 Answer

Not sure if this solves your problem, but I have a similar use case: I want a Bokeh plot of gridded data that gets updated frequently. Because a NetCDF file cannot be read and written at the same time in Python, this did not work out of the box; writing to the NetCDF file while plotting its data would require some sort of manual synchronization.
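For illustration only, one possible form of such manual synchronization is to have the writer produce each snapshot in a temporary file and move it into place atomically, while the reader re-reads only when the file has changed. This is a rough sketch under assumptions, not something tested with a real model: `publish_snapshot`, `read_if_changed`, `write_fn` and `read_fn` are hypothetical names, and the demo uses plain bytes where a real setup might call `ds.to_netcdf(p)` and `xr.open_dataset(p)`.

```python
import os
import tempfile
from pathlib import Path


def publish_snapshot(write_fn, final_path):
    """Write a complete file to a temporary path, then move it into place.

    write_fn is a hypothetical callback that writes the whole file to the
    path it receives (for example: lambda p: ds.to_netcdf(p)).
    """
    final_path = Path(final_path)
    # Create the temporary file next to the target so both are on the
    # same filesystem, which os.replace needs in order to be atomic.
    fd, tmp_name = tempfile.mkstemp(dir=final_path.parent, suffix=".tmp")
    os.close(fd)
    try:
        write_fn(tmp_name)
        os.replace(tmp_name, final_path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise


def read_if_changed(path, last_mtime, read_fn):
    """Call read_fn(path) only if the modification time differs from last_mtime.

    Returns (result, mtime); result is None when nothing changed.
    """
    mtime = os.stat(path).st_mtime_ns
    if last_mtime is not None and mtime == last_mtime:
        return None, last_mtime
    return read_fn(path), mtime


# Demo with plain bytes standing in for NetCDF content.
workdir = Path(tempfile.mkdtemp())
target = workdir / "latest.nc"

publish_snapshot(lambda p: Path(p).write_bytes(b"step 0"), target)
first, seen = read_if_changed(target, None, lambda p: Path(p).read_bytes())
unchanged, seen = read_if_changed(target, seen, lambda p: Path(p).read_bytes())
```

The reader never sees a half-written file because the target name only ever points at a finished snapshot. The cost is that the writer rewrites the whole file for every update, and the atomic-rename guarantee assumes a POSIX-style filesystem.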

I have not tried streams yet. But I found that using Zarr stores instead of NetCDF is a simple solution to this problem, since you can easily append to a Zarr store from xarray while another application reads from it.

cchwala