In Python I have a list of lists as input:
input = [[0,1,2],[0,3,4,5],[0,6]]
In reality, the number of sub-lists is tens of thousands. The length of each sub-list could vary greatly, from zero or one value to many hundreds.
I want to pass the input data as some 2D structure to a Cython module that will process it. I wish to process the data on multiple cores, thus I use prange with nogil=True:
from cython.parallel import prange

cpdef int my_func(long[:,:] arr):
    cdef int i, j
    # Outer loop runs in parallel with the GIL released
    for i in prange(arr.shape[0], nogil=True):
        for j in range(arr.shape[1]):
            # Do something
            pass
    return 42
I see the following solutions:
- Put the list of lists into a 2D ndarray. But as the length of each sub-list varies greatly, an ndarray is not an ideal data structure: every row would have to be padded to the length of the longest sub-list.
- Modify my_func to accept a list of lists. The problem is that part of the code is executed without the GIL and therefore cannot access Python objects.
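To make the first option concrete, here is a minimal plain-Python sketch of the padding it would require. The names pad_ragged and fill are hypothetical, and the sketch assumes -1 never occurs as a real value in the data:

```python
def pad_ragged(rows, fill=-1):
    """Pad every sub-list to the length of the longest one.

    Returns the rectangular (padded) rows plus the original lengths,
    so a consumer can tell real values from padding.
    """
    width = max((len(r) for r in rows), default=0)
    lengths = [len(r) for r in rows]
    padded = [list(r) + [fill] * (width - len(r)) for r in rows]
    return padded, lengths


data = [[0, 1, 2], [0, 3, 4, 5], [0, 6]]
padded, lengths = pad_ragged(data)

# Count how many cells hold real data versus how many were allocated.
used = sum(lengths)
total = len(padded) * (len(padded[0]) if padded else 0)
print(padded, lengths, used, total)
```

The padded rows could then be converted to an ndarray and passed to my_func, with the inner loop limited by the stored lengths. With sub-lists ranging from zero to many hundreds of values, most of the allocated cells would be padding, which is the drawback noted above.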
Does anyone have suggestions, preferably with code, on how to solve this problem?