
I am using an ATI RV770 graphics card, OpenCL 1.0, and ati-stream-sdk-v2.3-lnx64 on Linux.

While running my host code, which includes the following two sections to build the kernel program, I get error code -11, i.e. CL_BUILD_PROGRAM_FAILURE. Does this mean that the kernel program was compiled? If not, how is it compiled and debugged?

const char* KernelPath = "abc_kernel.cl";   //kernel program is in separate file but in same directory of host code..

/* Create Program object from the kernel source *******/

char* sProgramSource = readKernelSource(KernelPath);
size_t sourceSize =  strlen(sProgramSource) ;
program = clCreateProgramWithSource(context, 1,(const char **) &sProgramSource,&sourceSize, &err);
checkStatus("error while creating program",err);

/* Build (compile & Link ) Program *******/

char* options = (char* )malloc(10*sizeof(char));
strcpy(options, "-g");
err = clBuildProgram(program, num_devices, devices_id, options, NULL, NULL);
checkStatus("Build Program Failed", err); //This line throwing the error....

The function that reads the kernel program is as follows:

/* read program source file*/

char* readKernelSource(const char* kernelSourcePath){
 FILE    *fp = NULL;
 size_t  sourceLength;
 char    *sourceString ;
 fp = fopen( kernelSourcePath , "r");
 if(fp == 0)
 {
        printf("failed to open file");
        return NULL;
 }
 // get the length of the source code
 fseek(fp, 0, SEEK_END);
 sourceLength = ftell(fp);
 rewind(fp);
 // allocate a buffer for the source code string and read it in
 sourceString = (char *)malloc( sourceLength + 1);
 if( fread( sourceString, 1, sourceLength, fp) !=sourceLength )
 {
          printf("\n\t Error : Fail to read file ");
          return 0;
 }
 sourceString[sourceLength+1]='\0';
 fclose(fp);
 return sourceString;

}// end of readKernelSource

Can anyone tell me how to fix it?

Does this mean it is an OpenCL compilation error at runtime, or something else?

I print the build log using clGetProgramBuildInfo() as shown below, but why is it not printing anything?

char* build_log; size_t log_size;

// First call to know the proper size
        err = clGetProgramBuildInfo(program, devices_id, CL_PROGRAM_BUILD_LOG, 0, NULL, &log_size);
        build_log = (char* )malloc((log_size+1));

        // Second call to get the log
        err = clGetProgramBuildInfo(program, devices_id, CL_PROGRAM_BUILD_LOG, log_size, build_log, NULL);
        build_log[log_size] = '\0';
        printf("--- Build log ---\n ");
        fprintf(stderr, "%s\n", build_log);
        free(build_log);
Gopal
  • Did you paste the kernel source into a 3rd party tool of some kind? A profiler maybe? When I get build errors, they are usually syntax within the program itself. – mfa Feb 27 '12 at 11:29

2 Answers


This error is typically caused by a syntax error in your kernel code. You can call the OpenCL function clGetProgramBuildInfo with the flag CL_PROGRAM_BUILD_LOG to access the log generated by the compiler. This log contains the output you are probably used to when compiling on the command-line (errors, warnings, etc.).

For example, you could add something similar to the following after you call clBuildProgram:

if (err == CL_BUILD_PROGRAM_FAILURE) {
    // Determine the size of the log
    size_t log_size;
    clGetProgramBuildInfo(program, devices_id[0], CL_PROGRAM_BUILD_LOG, 0, NULL, &log_size);

    // Allocate memory for the log
    char *log = (char *) malloc(log_size);

    // Get the log
    clGetProgramBuildInfo(program, devices_id[0], CL_PROGRAM_BUILD_LOG, log_size, log, NULL);

    // Print the log
    printf("%s\n", log);

    // Release the log buffer once it has been printed
    free(log);
}

You can also see the function buildOpenCLProgram() in SDKCommon.cpp in the AMD APP SDK for a real example.

Michael Boyer
  • I did the same to print build_log info as shown above. But it is printing anything, even i used malloc() to create storage for build_log. – Gopal Feb 27 '12 at 16:35
  • I'm guessing you mean "it is **not** printing anything"? Can you confirm that the printf is actually being called? You might need to initialize the log first to ensure that the string is null-terminated. Try adding "memset(log, 0, log_size);" after the call to malloc. – Michael Boyer Feb 27 '12 at 16:47
  • I tried "memset((void *)build_log, 0, sizeof(build_log));" but this line gives a segmentation fault... – Gopal Feb 27 '12 at 17:13
  • sizeof(build_log) is wrong; build_log is a pointer, so sizeof(build_log) is the size of a pointer. What you really want to pass to memset is the number of bytes you allocated (log_size). – Michael Boyer Feb 27 '12 at 21:26
  • Well, after calling clGetProgramBuildInfo() I got the build status saying that no build has been performed on the specified program object for the device. Does that mean the kernel code was not compiled? Please help! How can we compile our kernel code successfully? – Gopal Feb 28 '12 at 05:15
  • OK, I just noticed that you edited your original post to add your error checking code. Note that clGetProgramBuildInfo expects to be passed a single device as the second parameter, whereas clBuildProgram expects a _list_ of devices. But you are passing the same variable to both (I'm actually surprised that this compiles). You need to modify your code to pass a single device, as in the example I gave. – Michael Boyer Feb 29 '12 at 20:21
  • @Gopal use `calloc()` instead of `malloc()` to zero it (very late contribution ;)). – Matthieu Dec 03 '14 at 21:57
  • passing &log_size instead of NULL will cause CL_INVALID_VALUE error due to keeping callback function pointer NULL as mentioned here: https://www.khronos.org/registry/OpenCL/sdk/1.0/docs/man/xhtml/clBuildProgram.html – Amir ElAttar May 11 '17 at 14:06
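
The points raised in the comments about `sizeof(build_log)` and `calloc()` can be illustrated with a small sketch in plain C. It makes no OpenCL calls, and the helper names `alloc_log_buffer` and `print_sizes` are hypothetical, introduced only for illustration. It assumes `log_size` is the byte count reported by the first clGetProgramBuildInfo call.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of OpenCL or the original post.
   Returns a zero-filled buffer of log_size + 1 bytes, or NULL on failure. */
char *alloc_log_buffer(size_t log_size)
{
    /* calloc zeroes the memory, so the buffer is already a valid empty
       string even if nothing is ever written into it. */
    return calloc(log_size + 1, 1);
}

/* Hypothetical helper: shows why sizeof(pointer) is the wrong byte count
   to pass to memset. */
void print_sizes(const char *build_log, size_t log_size)
{
    /* sizeof applied to a pointer gives the size of the pointer itself,
       not the size of the allocation it points to. */
    printf("sizeof(build_log) = %zu (size of the pointer)\n", sizeof(build_log));
    printf("log_size          = %zu (bytes reported for the log)\n", log_size);
}
```

With a buffer allocated this way, no separate memset is needed, and the caller still has to free() the buffer after printing it.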

The problem here is that the program buffer is unterminated.

How do I know this? I had an OpenCL program that was working just fine until I disabled all of the Linux kernel mitigations in my kernel. Trying to run the program several days later resulted in similar OpenCL compilation errors for things that clearly aren't in the CL source file. Rebuilding the program didn't fix the issue. If you puts() the string that fread() fills, you'll see garbage at the end of the program buffer.

My solution was to set buffer[length] = '\0'; after allocating an extra byte for the terminator. Another solution could be to use calloc(). I suspect that one of the mitigations was making sure that allocated memory is cleared before being handed to user space.
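
As a minimal sketch of that fix, the reader below allocates one extra byte and writes the terminator at index `length`, not `length + 1`. The function name `read_source_terminated` and its `out_len` parameter are hypothetical and not taken from the question. The file is opened in binary mode on the assumption that the byte count from ftell() should match what fread() returns.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from the original post): reads a whole file into
   a freshly allocated, null-terminated buffer. Returns NULL on any failure.
   The caller owns the returned buffer and must free() it. */
char *read_source_terminated(const char *path, size_t *out_len)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    /* Determine the file size. */
    if (fseek(fp, 0, SEEK_END) != 0) { fclose(fp); return NULL; }
    long size = ftell(fp);
    if (size < 0) { fclose(fp); return NULL; }
    rewind(fp);

    /* Allocate one extra byte for the terminator. */
    char *buf = malloc((size_t)size + 1);
    if (buf == NULL) { fclose(fp); return NULL; }

    size_t got = fread(buf, 1, (size_t)size, fp);
    fclose(fp);
    if (got != (size_t)size) { free(buf); return NULL; }

    /* Terminate at index got, the first byte past the file contents. */
    buf[got] = '\0';
    if (out_len != NULL)
        *out_len = got;
    return buf;
}
```

The returned string can then be passed to clCreateProgramWithSource in the same way as `sProgramSource` in the question, and released with free() once the program object has been created.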

Eric Aya