I'm completely new to OpenCL programming. I have a working installation of the OpenCL library and drivers, but the program I'm trying to run does not produce the expected output: the output is all zeros. It is a simple vector_add program. Thanks in advance for any suggestions.

int main(int argc, char** argv)
{
cout << "Hello OpenCL" << endl;

vector<Platform> all_platforms;
int err = Platform::get(&all_platforms);
cout << "Getting Platform ... Error code " << err << endl;
if (all_platforms.size()==0)
    (cout << "No platforms" << endl, exit(0));
cout << "Platform info : " << all_platforms[0].getInfo<CL_PLATFORM_NAME>() << endl;
Platform default_platform = all_platforms[0];
cout << "Defaulting to it ..." << endl;

vector<Device> all_devices;
err = default_platform.getDevices(CL_DEVICE_TYPE_GPU, &all_devices);
cout << "Getting devices ... Error code : " << err << endl;
if (all_devices.size()==0)
    (cout << "No devices" << endl, exit(0));
Device default_device = all_devices[0];
cout << all_devices.size() << " devices & " << "Device info : " << all_devices[0].getInfo<CL_DEVICE_NAME>() << endl;
cout << "Defaulting to it ..." << endl;

Context context(default_device);
Program::Sources sources;

std::string kernel_code=
        "   void kernel simple_add(global const int* A, global const int* B, global int* C){"
        "   unsigned int i = get_global_id(0);  "
        "       C[i]=A[i]+B[i];                 "
        "   }                                                                               ";

sources.push_back(make_pair(kernel_code.c_str(), kernel_code.length()+1));
Program program(context, sources);

if (program.build(all_devices)==CL_SUCCESS)
    cout << "Built Successfully" << endl;

Buffer buffer_A(context,CL_MEM_READ_WRITE,sizeof(int)*10);
Buffer buffer_B(context,CL_MEM_READ_WRITE,sizeof(int)*10);
Buffer buffer_C(context,CL_MEM_READ_WRITE,sizeof(int)*10);

int A[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
int B[] = {0, 1, 2, 0, 1, 2, 0, 1, 2, 0};

CommandQueue queue(context,default_device);
queue.enqueueWriteBuffer(buffer_A,CL_TRUE,0,sizeof(int)*10,A); // load data from host to device
queue.enqueueWriteBuffer(buffer_B,CL_TRUE,0,sizeof(int)*10,B);

Kernel kernel(program, "vector_add");
kernel.setArg(0, buffer_A);
kernel.setArg(1, buffer_B);
kernel.setArg(2, buffer_C);

queue.enqueueNDRangeKernel(kernel,cl::NullRange,cl::NDRange(10),cl::NullRange);
queue.finish();

int *C = new int[10];
queue.enqueueReadBuffer(buffer_C, CL_TRUE, 0, 10 * sizeof(int), C);

for (int i=0;i<10;i++)
    std::cout << A[i] << " + " << B[i] << " = " << C[i] << std::endl;

return 0;
}
ayandas
  • You should check the errors at every call, or enable C++ exceptions for OpenCL. Otherwise you may miss any function returning an error. – DarkZeros Nov 18 '15 at 10:15
  • cl::NDRange(10) must be invalid. You should at least do 32,64,128 or 8192 whatever multiple of 64 suits you. cl::NDRange(64) – huseyin tugrul buyukisik Nov 18 '15 at 20:28
  • Any global size should be valid, regardless of the internal execution schedule, so you can use 1, 3, 5, 7, 13 or 59 if you like. That is not likely to be the problem. – DarkZeros Nov 19 '15 at 10:47

1 Answer

As pointed out in the comments, you should always check the error codes when using OpenCL API functions. This can be achieved by enabling exception handling in the C++ wrapper:

#define __CL_ENABLE_EXCEPTIONS      // with cl.hpp
//#define CL_HPP_ENABLE_EXCEPTIONS  // with cl2.hpp

#include <CL/cl.hpp>
#include <iostream>

int main(int argc, char *argv[])
{
  try
  {
    // OpenCL code here
  }
  catch (const cl::Error& err)
  {
    // what() names the failing API call; err() is the numeric OpenCL error code
    std::cerr << err.what() << " failed with error code " << err.err() << std::endl;
    return 1;
  }
  return 0;
}

If you do this, you will receive useful information about a couple of issues with your code.

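The clCreateKernel function returns CL_INVALID_KERNEL_NAME (-46). In your kernel source, you define the kernel function with the name simple_add, but then you try to create a kernel object using the name vector_add. The kernel object is therefore never created, nothing is written to buffer_C, and reading it back gives zeros.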
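
If you would rather not use exceptions from the wrapper, the numeric codes can still be made readable. The following is only a minimal sketch: error_name and check are hypothetical helper names, not part of OpenCL or cl.hpp, and the table covers just a handful of codes whose values are taken from the Khronos CL/cl.h header (-46 is the code reported for a wrong kernel name). It deliberately does not include the OpenCL headers, so it compiles on its own.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: translate a few OpenCL status codes into their names.
// The numeric values match CL/cl.h; only a small subset is listed here.
std::string error_name(int code) {
  switch (code) {
    case 0:   return "CL_SUCCESS";
    case -1:  return "CL_DEVICE_NOT_FOUND";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -33: return "CL_INVALID_DEVICE";
    case -46: return "CL_INVALID_KERNEL_NAME";
    case -52: return "CL_INVALID_KERNEL_ARGS";
    default:  return "unknown OpenCL error (" + std::to_string(code) + ")";
  }
}

// Hypothetical helper: throw if a call did not return CL_SUCCESS (0).
// 'call' is just a label for the message, e.g. "enqueueWriteBuffer".
void check(int code, const std::string& call) {
  if (code != 0) {
    throw std::runtime_error(call + " failed: " + error_name(code));
  }
}
```

With the C++ wrapper you would wrap each returned status, for example check(queue.enqueueWriteBuffer(...), "enqueueWriteBuffer"). For constructors such as Kernel, the wrapper accepts an optional cl_int* err out-parameter whose value can be passed to check in the same way.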

If you have an OpenCL platform with multiple devices, you may also receive an error when building your kernel program. This is because you are creating an OpenCL context with a single device, but then trying to build the program for a list of devices:

Context context(default_device);
// ...
if (program.build(all_devices)==CL_SUCCESS)
  cout << "Built Successfully" << endl;

The simplest fix is just to remove the argument from the build function, since by default it will build the program for all devices in the context (which is almost always what you actually want):

if (program.build()==CL_SUCCESS)
  cout << "Built Successfully" << endl;
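
Once the kernel name and the build call are fixed, it can help to compare the device output with a result computed on the host, so that a silent failure such as an all-zero buffer is caught immediately. This is a sketch under assumptions: reference_add and matches_reference are hypothetical names, and the code is plain C++ with no OpenCL calls.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: element-wise sum computed on the host,
// mirroring what the simple_add kernel does per work-item.
std::vector<int> reference_add(const std::vector<int>& a,
                               const std::vector<int>& b) {
  if (a.size() != b.size()) {
    throw std::invalid_argument("input vectors must have the same length");
  }
  std::vector<int> c(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    c[i] = a[i] + b[i];
  }
  return c;
}

// Hypothetical helper: compare a device result (e.g. the array filled by
// enqueueReadBuffer) against the host reference, element by element.
// Assumes device_result points to at least a.size() ints.
bool matches_reference(const std::vector<int>& a,
                       const std::vector<int>& b,
                       const int* device_result) {
  const std::vector<int> expected = reference_add(a, b);
  for (std::size_t i = 0; i < expected.size(); ++i) {
    if (device_result[i] != expected[i]) {
      return false;
    }
  }
  return true;
}
```

For the A and B arrays in the question, the reference result is 0, 2, 4, 3, 5, 7, 6, 8, 10, 9, so an all-zero C would make matches_reference return false.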
jprice
  • Good answer, for actually testing the code and providing all the feedback. Really, enabling exceptions is something every C++ OpenCL developer should do straight away. – DarkZeros Nov 19 '15 at 17:29