
First of all, sorry for the not very explicit title; I couldn't find anything better. This is probably a silly question, but I've been stuck on it for hours.

Basically, I have an xarray Dataset containing a data variable called data_index, which holds integers from 0 to 3. These integers are indexes into a list. I would like to add a data variable to my Dataset that holds, for each element, the list value at the index given by data_index.

Here is an example of what I have:

import xarray as xr

ds = xr.Dataset()
ds["data_index"] = (("x", "y"), [[0, 1], [2, 3]] )

list = ["a", "b" , "c", "d"]

I'd like to add a data variable called data to the dataset ds by picking the value of the list that corresponds to each index in data_index. It would be something like: ds["data"] = list[ds["data_index"]]

My Dataset is much bigger than this example. The real dimensions are (x: 30001, y: 20001), but data_index still contains only integers from 0 to 3, and the list still has only 4 elements.

I'm sure there is an easy way to do it but I just can't find it. Do you have any leads?

Sebastian Wozny
Morgane

1 Answer


To get the result, you can use xarray's isel method, which lets you index a dataset along a particular dimension using integer-based indexing.
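As a minimal sketch of how isel could be applied here: wrap the list in a one-dimensional DataArray and pass data_index as the indexer. With a DataArray indexer, xarray performs vectorized indexing, so the result should take the indexer's dimensions. The names `lookup` and `choice` are made up for this illustration and are not from the question.

```python
import xarray as xr

ds = xr.Dataset()
ds["data_index"] = (("x", "y"), [[0, 1], [2, 3]])

my_list = ["a", "b", "c", "d"]

# Hypothetical helper: the list as a 1-D DataArray along a throwaway
# dimension named "choice" (name chosen for this example only).
lookup = xr.DataArray(my_list, dims="choice")

# Indexing with a DataArray triggers vectorized indexing: the result
# takes the dimensions of the indexer, here ("x", "y").
ds["data"] = lookup.isel(choice=ds["data_index"])

print(ds["data"].dims)
print(ds["data"].values)
```

This avoids a Python-level loop, though I haven't measured it on an array of the size mentioned in the question.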

Edit:

You're right. I've added NumPy to reshape the resulting array so that it matches the dimensions of data_index.

import xarray as xr
import numpy as np

ds = xr.Dataset()
ds["data_index"] = (("x", "y"), [[0, 1], [2, 3]])

my_list = ["a", "b", "c", "d"]

# Look up each index in the flattened array, then restore the original shape.
data_values = np.array([my_list[idx] for idx in ds["data_index"].values.flatten()])
data_reshaped = data_values.reshape(ds["data_index"].shape)

# Reuse the dimension names of data_index for the new variable.
ds["data"] = xr.DataArray(data_reshaped, dims=ds["data_index"].dims)

print(ds)
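For a large dataset, the Python-level list comprehension is likely to be the bottleneck. A possible alternative, sketched here under the assumption that the list can be converted to a NumPy array, is NumPy integer-array ("fancy") indexing, which does the lookup in one vectorized step and keeps the shape, so no reshape is needed. The name `lookup` is hypothetical.

```python
import numpy as np
import xarray as xr

ds = xr.Dataset()
ds["data_index"] = (("x", "y"), [[0, 1], [2, 3]])

my_list = ["a", "b", "c", "d"]

# Hypothetical name: the list converted to a NumPy array so that it
# can be indexed with a whole integer array at once.
lookup = np.array(my_list)

# Integer-array indexing returns an array with the same shape as the
# index array, so the result is already (x, y) shaped.
values = lookup[ds["data_index"].values]

# Reuse the dimension names of data_index for the new variable.
ds["data"] = (ds["data_index"].dims, values)

print(ds)
```

At (x: 30001, y: 20001) the result has about 600 million elements, so memory is worth checking before running this: a NumPy array of single-character strings uses 4 bytes per element, which would be roughly 2.4 GB for the new variable alone.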
jagmitg
  • Hi, thanks for your answer. This solution almost works, but the shape of [my_list[idx] for idx in ds["data_index"].values.flatten()] doesn't match the dimensions ("x", "y"): it is flat. Is there a way to reshape the DataArray? – Morgane Jul 13 '23 at 15:07
  • OK, so this is working: `ds["data"] = xr.DataArray(np.reshape([my_list[idx] for idx in ds["data_index"].values.flatten()], ds.data_index.shape), dims=ds.data_index.dims)` But I'm wondering whether it is the optimal solution for a large dataset? – Morgane Jul 13 '23 at 15:20
  • @Morgane just updated – jagmitg Jul 13 '23 at 16:02
  • Thanks a lot for your help! – Morgane Jul 13 '23 at 16:36