I am looking for the fastest way to write data into a 2D numpy array using an array of indexes.
I have a large 2D boolean numpy array, buffer:
import numpy as np
n_rows = 100000
n_cols = 250
shape_buf = (n_rows, n_cols)
row_indexes = np.arange(n_rows, dtype=np.uint32)
buffer = np.full(shape=shape_buf, fill_value=False, dtype=np.bool_, order="C")
I want to write data into the buffer, one value per row, at the column indexes given by w_idx:
data = np.random.randint(0,2, size=n_rows, dtype = np.bool_)
w_idx = np.random.randint(n_cols, size=n_rows, dtype = np.uint32)
One solution is to use standard integer-array indexing:
%timeit buffer[row_indexes, w_idx] = data
2.07 ms ± 20.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
A faster solution is to flatten the indexes and use np.put:
flat_row_indexes = row_indexes * n_cols  # assumed definition: flat offset of the start of each row
%timeit buffer.put(flat_row_indexes + w_idx, data, "wrap")
1.76 ms ± 18.9 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
However, this last solution is still too slow for my application. Is it possible to go faster? Maybe by using another library, such as Numba?
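For completeness, here is a minimal self-contained sketch of the flattening step, checking that both writes produce the same buffer. It assumes flat_row_indexes is simply row_indexes * n_cols (the offset of each row in a C-ordered array); the smaller n_rows, the seeded generator, and the names buf_a, buf_b and same are only there for illustration.

```python
import numpy as np

n_rows, n_cols = 1000, 250  # smaller than the real buffer, for a quick check
rng = np.random.default_rng(0)

row_indexes = np.arange(n_rows, dtype=np.int64)
w_idx = rng.integers(0, n_cols, size=n_rows, dtype=np.int64)
data = rng.integers(0, 2, size=n_rows).astype(np.bool_)

# Offset of the first element of each row in the flattened C-ordered buffer
# (assumed definition of flat_row_indexes).
flat_row_indexes = row_indexes * n_cols

buf_a = np.zeros((n_rows, n_cols), dtype=np.bool_)
buf_b = np.zeros((n_rows, n_cols), dtype=np.bool_)

# Write one value per row with 2D integer-array indexing.
buf_a[row_indexes, w_idx] = data

# Write the same values through flat indexes with ndarray.put.
buf_b.put(flat_row_indexes + w_idx, data, mode="wrap")

same = np.array_equal(buf_a, buf_b)
```

Since every row is written exactly once, no flat index repeats, so the two results should match element for element.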