When I open a netCDF file with xarray in Python, I open it as a Dataset object:

import xarray as xr

ds = xr.open_dataset(file_path)

How do I get the nth time slice of this dataset as a NumPy array?

I know that I can get that if I know the NetCDF variable name, like so:

xvar = ds.data_vars[var_name]
array = xvar.isel(time=n).values

but that requires knowing var_name, i.e., the NetCDF variable name, which I may not know for all netCDF files.

With iris, this name is available as the attribute var_name in the resulting Cube object after loading the netCDF file with iris.load_cube. How can I get the same variable name in xarray after loading the netCDF file into an xarray dataset?

Or is there any even simpler way to get the nth time slice of the netCDF file as a NumPy array with xarray?

HelloGoodbye
  • Use `array = xvar.isel(time=n).load()` instead. This makes `array` a subset of `ds`. To get a NumPy array, you really do need to know the variable name. You can get the names with `ds.variables.keys()` – msi_gerva Dec 13 '22 at 21:20
  • @msi_gerva What do you mean by "as a NumPy array, you really need to know the variable name"? The code you provided also uses `xvar`, which still requires me to know the variable name, because I need it to obtain `xvar`. – HelloGoodbye Dec 14 '22 at 16:24
  • @msi_gerva I know I can get the names with `ds.variables.keys()`; the question is, which of those names corresponds to the actual data and not to metadata such as the coordinates or the projection? Finding that out is trivial with iris (I just do `iris.load_cube(file_path).var_name`, where `file_path` is the path to the netCDF file). The question is, how do I do it with xarray? – HelloGoodbye Dec 14 '22 at 16:32
  • What about `list(ds.variables.keys())`? Then you get just the short variable names. You can load your data into a dictionary with `datain = {vv:ds.variables[vv].load().values for vv in list(ds.variables.keys())}` or, since you wanted a specific time step: `datain = {vv:ds.isel(time=2).variables[vv].load().values for vv in list(ds.variables.keys())}` – msi_gerva Dec 14 '22 at 20:19
  • @msi_gerva How does that answer the question of which variable it is that corresponds to the actual data? – HelloGoodbye Dec 15 '22 at 12:13

1 Answer

Have you tried:

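# On a Dataset, `.values` is the mapping method, not an array, so collect one array per data variable
results = {name: da.values for name, da in ds.isel(time=n).data_vars.items()}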

That should return the time slice for all the variables, as you desire. Obviously you will have problems if there are multiple variables and you only want one of them, but there is no way you can know which you want without knowing the variable name anyway, so I don't think that should really be an issue.

If your question is "how can I specifically extract only one variable from a list of others without knowing its name", then that doesn't really make sense. If you don't know the data you want, how do you expect to get the data you want?
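For the common case where a file holds a single variable of interest, a heuristic can narrow the candidates. The following is a minimal sketch, not a definitive method: it assumes the variable of interest is the only data variable with a `time` dimension. The helper names `main_data_var_names` and `nth_time_slice`, and the in-memory `demo` dataset with its `tas` and `crs` variables, are hypothetical and only stand in for a file opened with `xr.open_dataset(file_path)`.

```python
import numpy as np
import xarray as xr


def main_data_var_names(ds, time_dim="time"):
    """Return names of data variables that vary along `time_dim`.

    Coordinates are excluded because `ds.data_vars` does not contain them;
    scalar helpers (e.g. a grid-mapping variable) drop out because they
    have no time dimension.
    """
    return [name for name, da in ds.data_vars.items() if time_dim in da.dims]


def nth_time_slice(ds, n, time_dim="time"):
    """Return the n-th time slice as a NumPy array if the choice is unambiguous."""
    names = main_data_var_names(ds, time_dim)
    if len(names) != 1:
        # Refuse to guess when zero or several variables qualify.
        raise ValueError(f"expected exactly one candidate variable, found {names}")
    return ds[names[0]].isel({time_dim: n}).values


# Hypothetical in-memory stand-in for xr.open_dataset(file_path).
demo = xr.Dataset(
    data_vars={
        "tas": (("time", "y", "x"), np.arange(24.0).reshape(2, 3, 4)),
        "crs": ((), 0),  # scalar, projection-style variable without a time dimension
    },
    coords={"time": [0, 1], "y": [10.0, 20.0, 30.0], "x": [1.0, 2.0, 3.0, 4.0]},
)

names = main_data_var_names(demo)  # candidate variable names
array = nth_time_slice(demo, 1)    # 2-D NumPy array for the second time step
```

The heuristic breaks down when auxiliary variables also carry a time dimension (a time-bounds variable, for example), which is why the sketch raises an error instead of picking one silently. In that situation the variable name still has to be chosen explicitly.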

Téo
  • Well, as I wrote in my question, with iris, you can extract this variable name by doing `iris.load_cube(netcdf_file_path).var_name`, which iris simply refers to as "the NetCDF variable name for the cube," and it is the variable with this name that I'm referring to. Do you think my question makes sense now? – HelloGoodbye Dec 19 '22 at 15:05
  • I've not used iris, but it looks like it implicitly assumes you have only one variable of interest. If that's the case then obviously you can just query it with the single variable name. You can do the same thing in xarray, assuming you only have one variable, using `var_name = list(ds.data_vars)[0]` (`ds.variables` would also list the coordinates) and then `ds[var_name]`. When you have more variables you obviously can't choose which one you want without knowing which one you want. – Téo Dec 20 '22 at 18:19