I am having trouble neatly joining NumPy arrays that are stored in a list, or produced by a generator, inside Numba-jitted functions. The simplest example is as follows:

import numpy as np
from numba import njit

@njit
def my_program():
    array_1 = np.array([0,3])
    array_2 = np.array([0,4,2,3])
    array_3 = np.array([9,1,3,3,5,9])
    
    list_of_arrays = [array_1, array_2, array_3]
    
    return join_arrays(list_of_arrays)

my_program()

Here, join_arrays should return a 1-D NumPy array such as np.array([0,3,0,4,2,3,9,1,3,3,5,9]).

I have tried:


@njit
def join_arrays_a(list_of_arrays):
    return np.hstack(list_of_arrays)

@njit
def join_arrays_b(list_of_arrays):
    return np.hstack(tuple(list_of_arrays))

@njit
def join_arrays_c(arrays):
    tot_len = sum(list(map(len, arrays)))
    new_array = np.zeros(shape=(tot_len,), dtype=np.int64)
    prev_array_lens = 0
    for array in arrays:
        for i in range(len(array)):
            new_array[prev_array_lens+i] = array[i]
        prev_array_lens += len(array)
    return new_array

join_arrays_c works, but I was hoping there would be a better way to do this, or a utility function somewhere, so that I do not have to write this much code by hand. Am I missing something? join_arrays_a and join_arrays_b do not work when used in my_program in place of join_arrays_c.

Patrick
  • AFAIK, no, the last function is the standard current way to do that in Numba (though you can assign views). It might change in future versions of Numba though. – Jérôme Richard Feb 24 '23 at 22:38
  • Hi Jérôme, are there any good utility libraries that include functions like this? It seems like a very common task. – Patrick Feb 24 '23 at 22:46
  • IMO, this is an issue with Numba that needs to be addressed (AFAIK there is an open issue on this). Using external modules to patch Numba does not seem like a great long-lasting solution... The best thing is likely to just wait (and possibly push the developers on GitHub to implement this). Note, however, that creating a new temporary array is generally not efficient, and it is better to write results into an existing array. – Jérôme Richard Feb 24 '23 at 23:38
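
The "assign views" remark in the first comment could be sketched roughly as follows: the inner element-by-element loop of join_arrays_c is replaced by one slice assignment per input array. This is only a minimal sketch, not an official Numba utility. The name join_arrays_d is made up for illustration, and the code assumes a non-empty list of 1-D arrays that all share the dtype of the first one. As in the question, the list is built inside a jitted function. The try/except around the import is an addition so that the sketch still runs as plain Python when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (an assumption for portability): run as plain Python without Numba.
    def njit(func):
        return func


@njit
def join_arrays_d(arrays):
    # Hypothetical helper: first pass computes the total output length.
    tot_len = 0
    for array in arrays:
        tot_len += len(array)

    # Assumes a non-empty list whose arrays all share the first array's dtype.
    new_array = np.empty(tot_len, dtype=arrays[0].dtype)

    # Second pass: copy each input into its slice (a view) of the output.
    pos = 0
    for array in arrays:
        n = len(array)
        new_array[pos:pos + n] = array
        pos += n
    return new_array


@njit
def my_program():
    array_1 = np.array([0, 3])
    array_2 = np.array([0, 4, 2, 3])
    array_3 = np.array([9, 1, 3, 3, 5, 9])

    list_of_arrays = [array_1, array_2, array_3]

    return join_arrays_d(list_of_arrays)


result = my_program()
```

This does not remove the need for a hand-written helper. It only shortens the copy loop, and it still allocates a new output array on every call, which the last comment advises against when an existing array can be reused.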
