
I have code that I want to parallelize with CuPy. I thought it would be straightforward: just write "import cupy as cp", replace every np. with cp., and it would work.

And it does work, the code does run, but it is much slower. I expected it to eventually beat NumPy when iterating through larger arrays, but that never seems to happen.

The code is:

import numpy as np

q = np.zeros((5,5))
q[:,0] = 20

def foo(array):

    result = array        # note: this aliases the input, so q is modified in place
    shedding_row = array*0
    for i in range(array.shape[0]):
        for j in range(array.shape[1]-1):

            # draw a Poisson-distributed amount to shed from the current cell
            shedding_param = 2 * (result[i,j])**.5
            shedding = np.random.poisson(shedding_param, 1)[0]

            # never shed more than the cell holds
            if shedding >= result[i,j]:
                shedding = result[i,j] - 1

            result[i,j+1] = result[i,j] - shedding

            if result[i,j+1] < 0:
                result[i,j+1] = 0

            shedding_row[i,j+1] = shedding

    return result, shedding_row

x,y = foo(q)

Is this supposed to get faster with CuPy? Am I using it wrong?

1 Answer


To get good performance from NumPy or CuPy, you should use vectorized array operations instead of Python for loops. This matters even more with CuPy: every per-element access inside the loop launches its own GPU kernel and synchronizes with the host, so the overhead alone makes it slower than plain NumPy.

Just as an example, this part of your loop:

for i in range((array.shape[0])):
    for j in range((array.shape[1])-1):

        shedding_param = 2 * (result[i,j])**.5

can be computed for all elements at once:

import numpy as xp  # change to "import cupy as xp" for GPU
shedding_param = 2 * xp.sqrt(result[:, :-1])
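The column loop cannot be removed entirely, because result[:, j+1] depends on result[:, j], but you can vectorize across all rows at once and keep only the loop over columns. A minimal sketch of that idea (foo_vectorized is just an illustrative name; it assumes xp.random.poisson accepts an array-valued lam, as NumPy's does):

def foo_vectorized(array):
    result = array.copy()                # work on a copy instead of mutating the input
    shedding_row = xp.zeros_like(array)
    for j in range(array.shape[1] - 1):  # columns still depend on each other
        col = result[:, j]
        # one Poisson draw per row, all in a single call
        shedding = xp.random.poisson(2 * xp.sqrt(col))
        # cap shedding so at least 1 unit remains, as in the original loop
        shedding = xp.where(shedding >= col, col - 1, shedding)
        result[:, j + 1] = xp.maximum(col - shedding, 0)
        shedding_row[:, j + 1] = shedding
    return result, shedding_row

Even then, don't expect CuPy to beat NumPy on a 5x5 array: each kernel launch has a fixed cost, so the GPU typically only pays off once the arrays reach tens of thousands of elements or more.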