I am converting code to use dask arrays instead of numpy arrays.

I have dask version 0.19.4 installed.

I have an array that has been replaced by an equivalent dask array:

import dask.array

# this used to be a numpy array, created like so:
# da = np.zeros((total_months,))
da = dask.array.zeros((total_months,), chunks=-1)

When I hit the following line of code (which previously worked when the array was a numpy array), I get an item assignment error:

for k in range(total_months):
    da[k] = awc - s0

NotImplementedError: Item assignment with <class 'int'> not supported

awc and s0 are both float scalars in the above.

My understanding is that normal numpy-style slicing and array element indexing work as expected with dask, but that isn't happening here. Where might I have gone astray?

halfer
James Adams

1 Answer

Correct. Item assignment with integers is not supported.

Generally, workflows that iterate over large arrays with for loops don't scale, so dask array makes no attempt to support them.
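
For the specific loop in the question, every element receives the same scalar, so one loop-free alternative is to build the array already filled with that value. The following is only a sketch: the values given to `total_months`, `awc` and `s0` are made-up placeholders, and `dask.array.full` is one of several ways to do this.

```python
import numpy as np
import dask.array

# Placeholder values for illustration only; the question does not give them.
total_months = 24
awc, s0 = 150.0, 37.5

# Create the array already filled with the constant, instead of
# assigning it element by element in a Python loop.
da = dask.array.full((total_months,), awc - s0, chunks=-1)

# Nothing is computed until compute() is called.
values = da.compute()
```

This expresses the whole fill as a single array operation, which is the style dask arrays are designed around.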

MRocklin
  • Thanks for the confirmation, @MRocklin. To further clarify: I have a function that takes a 1-D vector (time series) and computes a corresponding time series (i.e. `ts_result = f(ts_input)`). The overall data is stored in a 3-D array with dims (lat, lon, time), and the function is applied at each lat/lon point. Does this mean that if the input/output arrays are too big for memory (e.g. when lat and lon are large), dask will be of no help, since the function itself uses numpy-style element assignment via integer indices? I'm essentially looking for a way to map the function over each time series in the array. – James Adams Oct 23 '18 at 13:17
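
One possible way to map such a function over every time series is to keep the time axis in a single chunk and let `map_blocks` hand each block to the function as an ordinary NumPy array, where integer item assignment works as usual. This is a sketch under assumptions: the body of `f`, the helper `apply_to_block`, and the array shape and chunk sizes are hypothetical stand-ins, not the code from the question.

```python
import numpy as np
import dask.array


def f(ts_input):
    """Hypothetical stand-in for the per-series function in the comment.

    It receives a plain 1-D NumPy array, so integer item assignment is fine.
    """
    ts_result = np.zeros_like(ts_input)
    for k in range(ts_input.shape[0]):
        ts_result[k] = ts_input[k] * 2.0
    return ts_result


def apply_to_block(block):
    # block is a NumPy array of shape (lat_chunk, lon_chunk, time);
    # apply f along the last (time) axis of that block.
    return np.apply_along_axis(f, -1, block)


# Illustrative (lat, lon, time) array, chunked over lat/lon only so that
# each block contains complete time series.
data = dask.array.ones((4, 6, 12), chunks=(2, 3, -1))

result = data.map_blocks(apply_to_block, dtype=data.dtype)
computed = result.compute()
```

Because each block is processed independently, only the blocks being worked on need to fit in memory at once; the chunk sizes along lat and lon would need tuning for real data.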
  • Update: As of dask version 2021.04.1, numpy-style item assignment is now supported - see https://stackoverflow.com/a/67535814/8467988 for more details. – dhassell May 14 '21 at 14:36
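
To illustrate the update above: with a sufficiently recent dask (2021.04.1 or later, per that comment), an integer item assignment like the one in the question should no longer raise `NotImplementedError`. A minimal sketch, with an arbitrary array size and value chosen for illustration:

```python
import numpy as np
import dask.array

# Assumes dask >= 2021.04.1; older versions raise NotImplementedError here.
x = dask.array.zeros((5,), chunks=-1)
x[2] = 1.5

out = x.compute()
```

Assigning inside a long Python loop still adds one graph operation per assignment, so the scalability concern from the answer continues to apply.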