I'm trying to implement a Gaussian filter for images with Python and PyOpenCL, based on code I found online. My original images are NumPy arrays, but I am unsure which data structure I should use to pass the images to the GPU.
Originally, the kernel receives OpenCL Images as input. This works and the kernel runs properly. However, I have not found a way to convert the output of the GPU computation (also an OpenCL Image) to a NumPy array. I need this because I have to carry out other computations after running the GPU filter.
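For reference, one way an OpenCL Image can be read back into a NumPy array is sketched below. It assumes a four-channel float32 image and uses PyOpenCL's `image_from_array` and `enqueue_copy`. The helper names `to_rgba_float32` and `image_roundtrip` are hypothetical, and the sketch reads back the input image only to keep it short. The same `enqueue_copy` call should apply to a `__write_only` output image after the kernel has run.

```python
import numpy as np


def to_rgba_float32(gray):
    """Expand an (H, W) array to a C-contiguous (H, W, 4) float32 RGBA array."""
    gray = np.asarray(gray, dtype=np.float32)
    rgba = np.empty(gray.shape + (4,), dtype=np.float32)
    rgba[..., :3] = gray[..., None]  # replicate intensity into R, G, B
    rgba[..., 3] = 1.0               # opaque alpha
    return rgba


def image_roundtrip(rgba):
    """Upload an RGBA array as an OpenCL image, then copy it back to NumPy."""
    import pyopencl as cl  # imported here so the helper above runs without OpenCL

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)
    height, width = rgba.shape[:2]

    # The RGBA/FLOAT image format is inferred from the dtype and num_channels.
    src = cl.image_from_array(ctx, rgba, num_channels=4, mode="r")

    # Allocate a host array of matching shape, then copy the image region into it.
    # Note that region is given as (width, height), not (height, width).
    host = np.empty_like(rgba)
    cl.enqueue_copy(queue, host, src, origin=(0, 0), region=(width, height))
    return host
```

Calling `image_roundtrip(to_rgba_float32(img))` on a machine with an OpenCL device would be expected to return an array equal to the uploaded one.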
I tried using PyOpenCL Arrays instead, but ran into the following problems:
- I did not know how to tell the kernel that the input would be an Array, since it is a PyOpenCL data structure, not an OpenCL one.
- I could not find an equivalent of read_imagef to be used on PyOpenCL Arrays, and I use that function in my kernel.
- I could not get the GPU result copied back to the host. I kept getting a 'cl_array does not have module get()' error.
I would like to know:
- Is there a way to tell the kernel that it will receive an Array, just as I use image2d_t to say that the input is an Image?
- What could I use as an equivalent to OpenCL's read_imagef for PyOpenCL Arrays?
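To make the two questions concrete, here is a hedged sketch of what a buffer-based variant might look like. It assumes a single-channel float32 image, declares the arguments as `__global float *` pointers, and replaces `read_imagef` with manual row-major indexing plus `clamp` to imitate `CLK_ADDRESS_CLAMP_TO_EDGE`. The names `gaussian_buf`, `BUFFER_KERNEL_SRC`, and `gaussian_on_buffers` are hypothetical and not part of the original code.

```python
import numpy as np

# Buffer-based variant: plain __global pointers stand in for image2d_t,
# and explicit index arithmetic stands in for the image read function.
BUFFER_KERNEL_SRC = """
__kernel void gaussian_buf(__global const float *input,
                           __global const float *filter,
                           __global float *output,
                           const int width,
                           const int height,
                           const int filterWidth)
{
    const int x = get_global_id(0);
    const int y = get_global_id(1);
    if (x >= width || y >= height) return;
    float sum = 0.0f;
    for (int r = 0; r < filterWidth; r++) {
        for (int c = 0; c < filterWidth; c++) {
            // Emulates CLK_ADDRESS_CLAMP_TO_EDGE.
            const int xi = clamp(x + r, 0, width - 1);
            const int yi = clamp(y + c, 0, height - 1);
            sum += filter[c * filterWidth + r] * input[yi * width + xi];
        }
    }
    output[y * width + x] = sum;
}
"""


def gaussian_on_buffers(image, filt):
    """Run the buffer kernel on 2D float32 data and return a NumPy array."""
    import pyopencl as cl
    import pyopencl.array as cl_array

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)
    program = cl.Program(ctx, BUFFER_KERNEL_SRC).build()

    image = np.ascontiguousarray(image, dtype=np.float32)
    filt = np.ascontiguousarray(filt, dtype=np.float32)
    height, width = image.shape

    d_in = cl_array.to_device(queue, image)
    d_filt = cl_array.to_device(queue, filt)
    d_out = cl_array.empty_like(d_in)

    # .data is the underlying cl.Buffer, which is what the kernel receives.
    program.gaussian_buf(queue, (width, height), None,
                         d_in.data, d_filt.data, d_out.data,
                         np.int32(width), np.int32(height),
                         np.int32(filt.shape[0]))
    # get() is a method of the Array instance, not of the cl_array module.
    return d_out.get()
```

In this sketch the kernel never sees the PyOpenCL Array itself. It only receives the `cl.Buffer` exposed through `.data`, and the result comes back to the host by calling `get()` on the output Array.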
Thanks a lot in advance. The kernel code is below:
__kernel void gaussian(__read_only image2d_t inputImage,
                       __read_only image2d_t filterImage,
                       __write_only image2d_t outputImage,
                       const int nInWidth,
                       const int nFilterWidth){
    // Unnormalized integer coordinates, clamped at the border, no interpolation.
    const sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_CLAMP_TO_EDGE | CLK_FILTER_NEAREST;
    const int xOut = get_global_id(0);
    const int yOut = get_global_id(1);
    float4 sum = (float4)(0.0f, 0.0f, 0.0f, 1.0f);
    for(int r = 0; r < nFilterWidth; r++){
        for(int c = 0; c < nFilterWidth; c++){
            // Vector literals need an explicit type; a bare (a, b) is the comma operator.
            int2 location = (int2)(xOut + r, yOut + c);
            // The filter is indexed by its own coordinates, not by the pixel position.
            float4 filterVal = read_imagef(filterImage, sampler, (int2)(r, c));
            float4 inputVal = read_imagef(inputImage, sampler, location);
            sum.x += filterVal.x * inputVal.x;
            sum.y += filterVal.y * inputVal.y;
            sum.z += filterVal.z * inputVal.z;
        }
    }
    sum.w = 1.0f;
    int2 outLocation = (int2)(xOut, yOut);
    write_imagef(outputImage, outLocation, sum);
}
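To check a GPU result once it is back on the host, a CPU reference can help. The sketch below is a NumPy version of what the kernel appears intended to compute for a single channel. It assumes the filter is indexed by (r, c), the input by (xOut + r, yOut + c), and that out-of-range coordinates are clamped to the edge. The function name `gaussian_reference` is hypothetical.

```python
import numpy as np


def gaussian_reference(image, filt):
    """CPU reference for one channel.

    out[y, x] = sum over r, c of filt[c, r] * image[y + c, x + r],
    with out-of-range coordinates clamped to the image edge.
    """
    image = np.asarray(image, dtype=np.float32)
    filt = np.asarray(filt, dtype=np.float32)
    height, width = image.shape
    n = filt.shape[0]
    ys = np.arange(height)
    xs = np.arange(width)
    out = np.zeros_like(image)
    for r in range(n):        # offset along x
        for c in range(n):    # offset along y
            yi = np.minimum(ys + c, height - 1)  # clamp rows to the last row
            xi = np.minimum(xs + r, width - 1)   # clamp columns to the last column
            out += filt[c, r] * image[np.ix_(yi, xi)]
    return out
```

Comparing this against the GPU output with `np.allclose` on a small test image is one way to catch indexing mistakes in the kernel.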