I have some code that uses Numba's cuda.jit to run on the GPU, and I would like to layer Dask on top of it if possible.
Example Code
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from numba import cuda
import numpy as np
from dask.distributed import Client, LocalCluster


@cuda.jit()
def addingNumbersCUDA(big_array, big_array2, save_array):
    # One thread per row; each thread loops over the columns of its row.
    # Note: despite the name, this computes an element-wise product.
    i = cuda.grid(1)
    if i < big_array.shape[0]:
        for j in range(big_array.shape[1]):
            save_array[i, j] = big_array[i, j] * big_array2[i, j]


if __name__ == "__main__":
    cluster = LocalCluster()
    client = Client(cluster)

    big_array = np.random.random_sample((100, 3000))
    big_array2 = np.random.random_sample((100, 3000))
    save_array = np.zeros(shape=(100, 3000))

    # Launch configuration: enough blocks to cover every row (ceiling division).
    arraysize = big_array.shape[0]
    threadsperblock = 64
    blockspergrid = (arraysize + (threadsperblock - 1)) // threadsperblock

    # Copy the inputs and the output buffer to the GPU.
    d_big_array = cuda.to_device(big_array)
    d_big_array2 = cuda.to_device(big_array2)
    d_save_array = cuda.to_device(save_array)

    addingNumbersCUDA[blockspergrid, threadsperblock](d_big_array, d_big_array2, d_save_array)

    # Copy the result back to the host.
    save_array = d_save_array.copy_to_host()
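A side note on the launch configuration: since each thread handles one row, the grid only needs enough threads to cover the 100 rows. As far as I understand, the usual way to size a 1D grid is ceiling division of the item count by the block size. A minimal sketch of that arithmetic (the helper name blocks_per_grid is mine, purely for illustration):

```python
def blocks_per_grid(n_items, threads_per_block):
    """Smallest number of blocks whose threads cover n_items (ceiling division)."""
    return (n_items + threads_per_block - 1) // threads_per_block


if __name__ == "__main__":
    # 100 rows with 64 threads per block -> 2 blocks (128 threads, 28 of them idle).
    print(blocks_per_grid(100, 64))
```

The `i < big_array.shape[0]` check in the kernel is what keeps the surplus threads in the last block from writing out of bounds.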
If my function addingNumbersCUDA didn't use any CUDA, I would just wrap the call in client.submit (and call gather afterwards) and it would work. But since I'm using CUDA, putting submit in front of the function doesn't work. The Dask documentation says that you can target the GPU, but it's unclear how to actually set that up in practice. How would I set up my function to use Dask with the GPU targeted, and with cuda.jit if possible?
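For comparison, this is roughly the CPU-only submit/gather pattern I am referring to. It is only a sketch: multiply_cpu and run_with_dask are hypothetical names standing in for a non-CUDA version of my function, and I pass processes=False and dashboard_address=None just to keep the example in a single process without starting the dashboard.

```python
import numpy as np
from dask.distributed import Client


def multiply_cpu(a, b):
    """Plain NumPy stand-in for the kernel: element-wise product on the CPU."""
    return a * b


def run_with_dask(a, b):
    """Submit multiply_cpu to a local in-process Dask cluster and gather the result."""
    # Assumption: a thread-based local cluster is enough for this illustration.
    with Client(processes=False, dashboard_address=None) as client:
        future = client.submit(multiply_cpu, a, b)
        return client.gather(future)


if __name__ == "__main__":
    big_array = np.random.random_sample((100, 3000))
    big_array2 = np.random.random_sample((100, 3000))
    save_array = run_with_dask(big_array, big_array2)
    print(save_array.shape)
```

What I am missing is the equivalent of this pattern when the submitted work has to launch a cuda.jit kernel.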