PyCUDA: GPUArray.get() returns inaccessible array

Question

I am trying to sum an array on the GPU and then copy the result back to the host. For this, I am using the pycuda.gpuarray.sum() function.

import numpy as np
import pycuda.autoinit  # sets up a CUDA context
import pycuda.gpuarray as gpuarray

a = np.array([1,2,3,4,5])
b = gpuarray.to_gpu(a)
c = gpuarray.sum(b)
c = c.get()
print(c)   # Prints array(15)
print(type(c)) # Prints numpy.ndarray
print(c[0]) # IndexError: too many indices for array
print(c.shape) # Prints (), empty tuple

How do I obtain the result of the sum() function as a normal integer?

On my particular pycuda install, when I `print(c)` I get `15` and not anything else. — Robert Crovella, Dec 10 '20 at 17:32

score 2 · Accepted Answer · answered Dec 12 '20 at 05:57

The function gpuarray.sum() sums all the elements and returns a single value, as @Robert Crovella said. After .get(), your variable c is therefore always a 0-dimensional numpy array (in other words, a scalar), which is why c.shape is the empty tuple and why indexing it with c[0] raises an error.
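To get a plain integer out of a 0-dimensional array, something along these lines should work. This is a minimal sketch that uses only NumPy so it runs without a GPU: the array `c` below is an assumed stand-in for the value returned by `gpuarray.sum(b).get()`, and the names `as_item`, `as_int` and `as_np` are made up for illustration.

```python
import numpy as np

# Assumed stand-in for gpuarray.sum(b).get(): a 0-dimensional array.
c = np.array(15)

as_item = c.item()   # Python int via ndarray.item()
as_int = int(c)      # Python int via the built-in conversion
as_np = c[()]        # NumPy scalar via empty-tuple indexing

print(as_item, type(as_item))
print(as_int, type(as_int))
print(as_np, type(as_np))
```

`c.item()` and `int(c)` give a built-in Python int, while `c[()]` keeps the NumPy dtype of the array.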

If you want this to be a 1-dimensional array you can do this:

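import pycuda.autoinit  # sets up a CUDA context
import pycuda.gpuarray as gpuarray
import numpy as np

a = np.array([1,2,3,4,5])
d_a = gpuarray.to_gpu(a)   # copy the input to the device
d_c = gpuarray.sum(d_a)    # 0-dimensional GPUArray holding the sum

h_d = np.zeros((1,))       # 1-element host array (float64 by default)
h_d[0] = d_c.get()         # copy the sum back into element 0
print("np.array h_d: ", h_d)
print("h_d[0] = ", h_d[0])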

(PyCuda 2020.1)
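As a possible alternative to preallocating a one-element array, NumPy can promote the 0-dimensional result directly. The sketch below again uses a plain NumPy array `c` as an assumed stand-in for `d_c.get()`, so it runs without a GPU; `h_a` and `h_r` are illustrative names.

```python
import numpy as np

# Assumed stand-in for d_c.get(): a 0-dimensional array.
c = np.array(15)

h_a = np.atleast_1d(c)   # shape (1,), dtype of c is preserved
h_r = c.reshape(1)       # same result via an explicit reshape

print(h_a, h_a.shape, h_a[0])
print(h_r, h_r.shape, h_r[0])
```

Unlike assigning into `np.zeros((1,))`, which stores the sum as float64, both calls keep the integer dtype of the original result.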
