I am using clang to generate LLVM IR for Nvidia OpenCL and CUDA kernels, which I want to instrument afterwards. For OpenCL I do something like this:

clang -c -x cl -S -emit-llvm -cl-std=CL2.0 kernel.cl -o kernel.ll

and what's described here for CUDA.

What I am looking for is a way to go from the instrumented IR to an actual binary. For CUDA, I know I can use the NVPTX backend to generate PTX and JIT-compile it as described here (or perhaps use ptxas?). Is something similar possible for OpenCL? If so, a minimal example would be appreciated. Thanks in advance.

Robert Crovella
0x6K5

1 Answer

You can in principle extract binaries for loaded and compiled OpenCL kernels by using clGetProgramInfo() with CL_PROGRAM_BINARY_SIZES and CL_PROGRAM_BINARIES.

As far as I'm aware, the binaries you get back are in an entirely implementation-defined format. So if you're unlucky, you just get IR code back anyway. With any luck, however, they might contain PTX code on your platform.

pmdj