12

I'm not sure if it's possible. I want to study OpenCL in depth, so I was wondering if there is a tool to disassemble a compiled OpenCL kernel.

For a normal x86 executable, I can use objdump to get a disassembly view. Is there a similar tool for OpenCL kernels yet?

caf
Patrick
  • I realize that disassembling an OpenCL kernel is very vendor dependent. Please search the platform-specific SDK, e.g. the Nvidia and Intel OpenCL SDKs; they all include some sort of disassembler for their kernels. – Patrick Aug 16 '11 at 23:20

6 Answers

6

If you're using NVIDIA's OpenCL implementation for their GPUs, you can do the following to disassemble an OpenCL kernel:

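  1. Use clGetProgramInfo() with CL_PROGRAM_BINARIES to dump the PTX code to a file, say ptxfile.ptx (clGetEventProfilingInfo() only returns event timing data, so it cannot be used for this). Please refer to the OpenCL specification for more details on this function.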

  2. Use nvcc to compile the PTX to a cubin file. For example, nvcc -cubin -arch=sm_20 ptxfile.ptx compiles ptxfile.ptx for a compute capability 2.0 device.

  3. Use cuobjdump to disassemble the cubin file into GPU instructions. For example: cuobjdump -sass ptxfile.cubin
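
Steps 2 and 3 can be scripted. The following is only a minimal sketch, assuming nvcc and cuobjdump are on the PATH and that ptxfile.ptx already exists; the helper names build_commands and disassemble are made up for illustration. The architecture is a parameter because newer CUDA toolkits may no longer accept sm_20.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(ptx_path, arch="sm_20"):
    """Return the nvcc and cuobjdump command lines for one PTX file."""
    ptx = Path(ptx_path)
    cubin = ptx.with_suffix(".cubin")  # ptxfile.ptx -> ptxfile.cubin
    return [
        ["nvcc", "-cubin", f"-arch={arch}", str(ptx), "-o", str(cubin)],
        ["cuobjdump", "-sass", str(cubin)],
    ]


def disassemble(ptx_path, arch="sm_20"):
    """Run both steps; return the SASS text, or None if a tool is missing."""
    compile_cmd, dump_cmd = build_commands(ptx_path, arch)
    if shutil.which("nvcc") is None or shutil.which("cuobjdump") is None:
        return None  # CUDA toolkit not installed on this machine
    subprocess.run(compile_cmd, check=True)
    result = subprocess.run(dump_cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

Calling disassemble("ptxfile.ptx") would then return the same text that cuobjdump -sass prints.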

Hope this helps.

einpoklum
Zk1001
  • Worth noting that while all of this is correct, it is completely specific to the NVIDIA toolchain and not applicable to other vendors' implementations. – talonmies Jul 14 '11 at 06:57
  • Yes, I forgot to add that. Thanks for fixing it. – Zk1001 Jul 14 '11 at 07:49
  • Thanks! This is still worth something. I will try on CUDA first; if there is a similar solution for ATI and Sandy Bridge, I'd like to hear from you as well. – Patrick Jul 14 '11 at 15:24
  • Why would `clGetEventProfilingInfo` yield a PTX file? – einpoklum May 19 '21 at 07:21
4

I know that this is an old question, but in case someone comes looking here for how to disassemble an AMD GPU kernel, you can do the following on Linux:

export GPU_DUMP_DEVICE_KERNEL=3

This makes any kernel that is compiled on your machine dump the assembled code to a file in the same directory.

Source: http://dis.unal.edu.co/~gjhernandezp/TOS/GPU/ATI_Stream_SDK_OpenCL_Programming_Guide.pdf

Sections 4.2.1 and 4.2.2
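
If you would rather not export the variable for your whole shell session, you can set it for a single process. This is just a sketch, assuming the installed driver still honors GPU_DUMP_DEVICE_KERNEL; the helper names and the example program ./my_opencl_app are hypothetical.

```python
import os
import subprocess


def env_with_kernel_dump(level="3", base=None):
    """Copy an environment and add GPU_DUMP_DEVICE_KERNEL to the copy."""
    env = dict(os.environ if base is None else base)
    env["GPU_DUMP_DEVICE_KERNEL"] = level
    return env


def run_with_kernel_dump(command, workdir="."):
    """Run an OpenCL program with the dump variable set for that process only."""
    return subprocess.run(command, cwd=workdir, env=env_with_kernel_dump(), check=False)
```

For example, run_with_kernel_dump(["./my_opencl_app"]) starts the program with the variable set and leaves the parent environment untouched.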

OznOg
KLee1
  • @Oak Most of the links on AMD's site right now are dead. Use [this link](http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0CDcQFjAB&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.225.1324%26rep%3Drep1%26type%3Dpdf&ei=XkmZUJy-AcK3iwKfgIG4Bg&usg=AFQjCNGWvvVOPqW8Bq8u6-sYRUsP3d9cgA&cad=rja) instead for right now. – KLee1 Nov 06 '12 at 17:32
  • `GPU_DUMP_DEVICE_KERNEL` does not work anymore: http://devgurus.amd.com/thread/159168 – gioele Dec 22 '13 at 11:14
1

The simplest solution, in my experience, is to use Clang's OpenCL C compiler and emit SPIR. It even works on Godbolt's Compiler Explorer: https://godbolt.org/z/_JbXPb

Clang can also emit PTX (https://godbolt.org/z/4ARMqM) and amdhsa (https://godbolt.org/z/TduTZQ), but the output may not correspond to the PTX and amdhsa assembly generated by the respective driver at runtime.
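
For local use, the Clang invocations might look roughly like the ones built below. This is a sketch under assumptions: the target triples, the gfx900 CPU, and the flag set are my guesses at a typical setup and are not taken from the Godbolt links, and the helper clang_command is hypothetical. Kernels that call OpenCL built-ins may need extra setup on some targets.

```python
from pathlib import Path

# Assumed per-target flags and output suffixes; adjust for your Clang version.
TARGETS = {
    "spir": (["--target=spir64-unknown-unknown", "-emit-llvm"], ".ll"),
    "ptx": (["--target=nvptx64-nvidia-cuda"], ".ptx"),
    "amdhsa": (["--target=amdgcn-amd-amdhsa", "-mcpu=gfx900"], ".s"),
}


def clang_command(kernel_path, target):
    """Build a clang command line that compiles an OpenCL C file to text output."""
    extra, suffix = TARGETS[target]
    out = Path(kernel_path).with_suffix(suffix)
    return [
        "clang", "-x", "cl",                   # treat the input as OpenCL C
        "-Xclang", "-finclude-default-header",  # declare the OpenCL built-ins
        "-S", *extra,                           # textual IR or assembly
        str(kernel_path), "-o", str(out),
    ]
```

Passing the result to subprocess.run, for example subprocess.run(clang_command("kernel.cl", "ptx"), check=True), would write kernel.ptx next to the source file.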

0

If you work with an AMD GPU, you can use the Analyzer tool. It is free, cross-platform, and comes in two forms:

  1. Command line tool (ships as part of the CodeXL package, search for the CodeXLAnalyzer executable after installing).
  2. CodeXL GUI application (just switch to the Analyzer mode in CodeXL).

Here is a short summary of what you can do with the Analyzer:

  1. Compile OpenCL kernels, OpenGL shaders and D3D shaders for any GPU that is supported by the installed driver (even without having the GPU physically installed on your system), and get the ISA. Using CodeXL Analyzer (option #2 above), you can get additional information such as an estimation for the number of clock cycles that are required to execute the instruction.
  2. View the compiler-generated statistics (SGPRs usage, VGPRs usage, etc.)
  3. Generate the AMD IL code for the OpenCL kernel.
  4. Export the compiled binaries (ELF, in binary format).

You can download the CodeXL tool suite from here: https://gpuopen.com/compute-product/codexl/

OznOg
AmitB
  • When you say "any GPU supported by the installed driver", does this include nVidia and Intel? Can I use it to create binaries for (say) nVidia and AMD at the same time? – barneypitt May 05 '18 at 18:59
  • No, this is just for AMD GPUs currently. – AmitB May 07 '18 at 02:18
0

As AMD CodeXLAnalyzer is not supported anymore, use Radeon GPU Analyzer instead.

Boris Ivanov
0

On Intel, you can use cliloader from https://github.com/intel/opencl-intercept-layer

If you run that with the --dump-kernel-isa-binaries flag, you will get a binary in $HOME/CLIntercept_Dump/*/*.isabin for each kernel.

This can then be disassembled using the iga64 tool as explained in this guide.

It will get you the disassembly, but I find the variable names, line numbers, etc, are all lost, so interpreting the assembly code is not easy.
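
To collect the dumped binaries before handing them to iga64, something like the following could work. It is a minimal sketch that only assumes the $HOME/CLIntercept_Dump/*/*.isabin layout mentioned above; the helper name find_isa_dumps is hypothetical, and the iga64 invocation itself is left to the guide.

```python
from pathlib import Path


def find_isa_dumps(home=None):
    """List every .isabin file under <home>/CLIntercept_Dump/<run>/."""
    root = Path(home) if home is not None else Path.home()
    return sorted((root / "CLIntercept_Dump").glob("*/*.isabin"))


if __name__ == "__main__":
    for path in find_isa_dumps():
        print(path)
```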

Bram