My program has two kernels and the second kernel should use the already uploaded input data and the results from the first kernel, so I can save the memory transfers. How would I archive this?
This is how I launch my kernels:
result = gpuarray.zeros(points, dtype=np.float32)
kernel(
driver.In(dataT),result,np.int32(points),
grid = (blocks,1),
block = (block_size, 1, 1),
)