OpenCL multiple Kernel execution on NVIDIA Tesla

Question

I have a problem and I don't know how to solve it.

I work with two clusters: one with 6 Tesla C1060 cards and another with 2 Tesla K20M cards.

I have two OpenCL programs that use JOCL as Java bindings. The first one has this structure:

1 OpenCL Kernel
...code...
clEnqueueNDRangeKernel(commandQueues[i], kernel[i], 1, null,
                global_work_size, local_work_size, 0, null, events[i]);
clFlush(commandQueues[i]);

This one works on both clusters, with the Tesla C1060 and with the Tesla K20M.

The second program has this structure:

4 OpenCL kernels
...code...
clEnqueueNDRangeKernel(commandQueues[i], kernel1[i], 1, null,
                global_work_size, local_work_size, 0, null, events[i]);
clEnqueueNDRangeKernel(commandQueues[i], kernel2[i], 1, null,
                global_work_size, local_work_size, 0, null, events[i]);
clEnqueueNDRangeKernel(commandQueues[i], kernel3[i], 1, null,
                global_work_size, local_work_size, 0, null, events[i]);
...code...
read result from 3rd Kernel and do a little data comparison
...code...
clEnqueueNDRangeKernel(commandQueues[i], kernel4[i], 1, null,
                global_work_size, local_work_size, 0, null, events[i]);
clFlush(commandQueues[i]);

I get the expected result, but only on the cluster with 2 Tesla K20M. On the other cluster with 6 Tesla C1060, I get a wrong result (the program starts and ends normally, but delivers a wrong result). I've tried it with only 1, 2, 3, 4, and 5 Tesla C1060 cards, and every time I get the wrong result.

I need help finding out whether a hardware problem causes this, or whether I have to change how the multiple kernel executions are started. Maybe I have to read the result back each time a kernel finishes and only then pass it to the next kernel?

I'd appreciate any help.

Thank you
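
On the hardware question above, one difference worth ruling out (a guess, not a diagnosis) is the work-group size limit. As far as I know, the Tesla C1060 reports a CL_DEVICE_MAX_WORK_GROUP_SIZE of 512 while the K20M reports 1024, and OpenCL 1.x requires global_work_size to be a multiple of local_work_size. A local_work_size tuned for the K20M could therefore be rejected on the C1060. The sketch below is a hypothetical host-side helper (LaunchCheck is not part of JOCL) that checks a 1-D launch before it is enqueued. The device limit should be queried with clGetDeviceInfo, not hard-coded.

```java
/** Hypothetical helper, not part of JOCL: validates a 1-D launch configuration on the host. */
final class LaunchCheck {
    private LaunchCheck() {}

    /** Rounds globalSize up to the next multiple of localSize. */
    static long roundUp(long globalSize, long localSize) {
        if (localSize <= 0) {
            throw new IllegalArgumentException("localSize must be positive");
        }
        long remainder = globalSize % localSize;
        return remainder == 0 ? globalSize : globalSize + (localSize - remainder);
    }

    /**
     * Returns null if the launch looks valid, otherwise a description of the problem.
     * maxWorkGroupSize is assumed to come from CL_DEVICE_MAX_WORK_GROUP_SIZE.
     */
    static String problem(long globalSize, long localSize, long maxWorkGroupSize) {
        if (globalSize <= 0 || localSize <= 0) {
            return "work sizes must be positive";
        }
        if (localSize > maxWorkGroupSize) {
            return "local_work_size " + localSize
                    + " exceeds the device limit " + maxWorkGroupSize;
        }
        if (globalSize % localSize != 0) {
            return "global_work_size " + globalSize
                    + " is not a multiple of local_work_size " + localSize;
        }
        return null;
    }
}
```

For example, LaunchCheck.problem(4096, 1024, 512) reports a problem, while the same sizes with a limit of 1024 pass. If the global size has to be padded with roundUp, the kernel needs a bounds check so the extra work-items do nothing.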

It is a good idea to ALWAYS check the error codes of the OpenCL calls. Sometimes enqueueing the kernel does not fail but reading the result does. That does not cause a crash, just an incorrect result. First check that none of the API calls returns an error, for example because of an incompatible local work-group size. - DarkZeros, Oct 16 '13 at 11:20
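
To illustrate the comment above, here is a minimal sketch of a status check, assuming the JOCL calls return the OpenCL status as an int (as clEnqueueNDRangeKernel does). ClStatus is a hypothetical helper, not a JOCL class, and it maps only a handful of standard OpenCL error codes to their names.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of JOCL: turns OpenCL status codes into exceptions. */
final class ClStatus {
    private static final Map<Integer, String> NAMES = new HashMap<>();

    static {
        // A small subset of the status codes defined by the OpenCL specification.
        NAMES.put(0, "CL_SUCCESS");
        NAMES.put(-4, "CL_MEM_OBJECT_ALLOCATION_FAILURE");
        NAMES.put(-5, "CL_OUT_OF_RESOURCES");
        NAMES.put(-52, "CL_INVALID_KERNEL_ARGS");
        NAMES.put(-54, "CL_INVALID_WORK_GROUP_SIZE");
        NAMES.put(-55, "CL_INVALID_WORK_ITEM_SIZE");
    }

    private ClStatus() {}

    /** Returns the symbolic name of a status code, or a fallback for unmapped codes. */
    static String describe(int status) {
        String name = NAMES.get(status);
        return name != null ? name : "unknown OpenCL status";
    }

    /** Throws if status is not CL_SUCCESS (0); call names the API call that was checked. */
    static void check(int status, String call) {
        if (status != 0) {
            throw new IllegalStateException(
                    call + " failed: " + describe(status) + " (" + status + ")");
        }
    }
}
```

Each enqueue in the question could then be wrapped, for example ClStatus.check(clEnqueueNDRangeKernel(...), "kernel1"), so that a rejected launch on the C1060 surfaces right away instead of as a wrong result. JOCL also appears to offer CL.setExceptionsEnabled(true), which should make failing calls throw without any wrapper.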

0 Answers