0

Context:

I have a large 3D array input_array (size: 1000 x 100 x 50) for which I want to do an operation on specific pixels (their indices are gathered in array idx_valid). Everytime I do an operation on one of the pixels, I grab the output and put it in the output_array (size 300 x 100 x 50). I also have 3 other input arrays (dates, dt_start, dt_end) that are used in my real code but not in this example.

Goal:

I want to parallelize these operations because each pixel operation can be treated individually. For that I use Numba, but I can not manage to initialize properly the decorator.

Decorator description:

I have 6 input in my function that I parallelize: input_array, idx_valid , dates, dt_start, dt_end, 100. Their formats are respectively: float64[:,:,::1], int64[:,::1], float64[::1], float64[::1], float64[:1], int_. I have 1 output for my function: output_array with format float64[:,:,::1].

Questions:

  1. How can I initialize the decorator without getting the following error: TypeError: No matching definition for argument type(s) array(float64, 3d, C), array(int32, 2d, C), array(float64, 1d, C), array(float64, 1d, C), array(float64, 1d, C), int64
  2. If I write my decorator @nb.njit(parallel=True, fastmath=True) it doesn't work, but with @nb.jit(parallel=True, fastmath=True) (without the "n") the code works. Why is that ?

Code Example:

import numpy as np
import numba as nb

# Generate random temporal matix time*y*x
input_array = np.random.rand(1000, 100, 50)

# Generate a 2x10000 array of random pixel indices of the 2D footprint of input_array (x*y): 100x50 
idx_valid = np.empty((2, 1000), dtype=int)
idx_valid[0] = np.random.randint(0, input_array.shape[1], size=1000)
idx_valid[1] = np.random.randint(0, input_array.shape[2], size=1000)

# Generate temporal matrices, numerical format
dates = np.random.rand(300) # That one is only a fraction of the temporal length of input_array
dt_start = np.random.rand(1000)
dt_end = np.random.rand(1000)   


# get_results + interpolator
@nb.njit(nb.float64[:, :, ::1](nb.float64[:,:,::1], nb.int64[:,::1], nb.float64[::1], nb.float64[::1], nb.float64[::1], nb.int_), parallel=True, fastmath=True)
def invertor_parallelized(input_array, idx_valid, dates, dt_start, dt_end, lamb):

    # create the output array
    output_array = np.full((len(dates), input_array.shape[1], input_array.shape[2]), np.nan)


    # Initialize matrices for the operation
    A = np.random.rand(1000, 300)  # create a 40x300 random array
    LHS = np.random.rand(300, 300)  # create a 300x300 random array

    # Where the parallelization happens, we iterate through the different valid pixels of input_array
    for i in nb.prange(idx_valid.shape[1]):

            # operation (it's an inversion)
            output_array[:,idx_valid[0][i],idx_valid[1][i]] = np.linalg.solve(LHS, np.matmul(np.transpose(A), input_array[:,idx_valid[0][i],idx_valid[1][i]]))

    return output_array


nb.set_num_threads(6)
# Example of usage, with lamb = 100:
invertor_parallelized(input_array, idx_valid, dates, dt_start, dt_end, 100)

Code that works:

# Generate random temporal matix time*y*x
input_array = np.random.rand(1000, 100, 50)

# Generate a 2x10000 array of random pixel indices of the 2D footprint of input_array (x*y): 100x50 
idx_valid = np.empty((2, 1000), dtype=int)
idx_valid[0] = np.random.randint(0, input_array.shape[1], size=1000)
idx_valid[1] = np.random.randint(0, input_array.shape[2], size=1000)

# Generate temporal matrices, numerical format
dates = np.random.rand(300) # That one is only a fraction of the temporal length of input_array
dt_start = np.random.rand(1000)
dt_end = np.random.rand(1000)



# get_results + interpolator
@nb.jit(parallel=True, fastmath=True)
def invertor_parallelized(input_array, idx_valid, dates, dt_start, dt_end, lamb):

    # create the output array
    output_array = np.full((len(dates), input_array.shape[1], input_array.shape[2]), np.nan)

    # Initialize matrices for the operation
    A = np.random.rand(1000, 300)  # create a 40x300 random array
    LHS = np.random.rand(300, 300)  # create a 300x300 random array

    # Where the parallelization happens, we iterate through the different valid pixels of input_array
    for i in nb.prange(idx_valid.shape[1]):

            # operation (it's an inversion)
            output_array[:,idx_valid[0][i],idx_valid[1][i]] = np.linalg.solve(LHS, np.matmul(np.transpose(A), input_array[:,idx_valid[0][i],idx_valid[1][i]]))

    return output_array


nb.set_num_threads(6)
# Example of usage, with lamb = 100:
invertor_parallelized(input_array, idx_valid, dates, dt_start, dt_end, 100)
Nihilum
  • 549
  • 3
  • 11
  • The signature warning is because of the dtype of `idx_valid`, just initialize that with the same type. So instead of `dtype=int` use `dtype=np.int64`. If `njit` doesn't work, it probably means `jit` falls back to object-mode (which you probably don't want to use). The warning is pretty clear, it's because of the use of `np.matmul`, can you replace that in this case with `@`? – Rutger Kassies May 03 '23 at 06:31

0 Answers0