How is control flow extracted from a SYCL kernel?

Question

Using SYCL to run code on any OpenCL device doesn't require a custom compiler, as everything is done in a library (full of template magic), and a standard GCC/Clang will do just fine. Is this correct? (Especially in the case of triSYCL, which I'm using...)

If so... I know that simple expression trees can be extracted by overloading a bunch of operators on custom "handle" or "wrapper" classes, but this is not the case with control flow. Am I wrong?

Section 3.1 of this paper discusses the pros and cons of a few different approaches to adding EDSLs to C++, but I'm more interested in the actual technical implementation of the method SYCL uses.

I tried to look at the source at some SYCL-related projects (Eigen, TensorFlow, triSYCL, ComputeCpp, etc.) but so far I could not find the answer in them.

So: How can a SYCL library(?) discover the full control flow graph of a kernel, given as an ordinary C++ lambda, without needing a custom/extended compiler?

score 2 · Accepted Answer · answered Jan 25 '18 at 10:02

I think you are right.

If you compile SYCL for CPU, since SYCL is a pure C++ executable DSEL, you can have an implementation that just uses a normal C++ compiler. This is how triSYCL works for example. https://github.com/triSYCL/triSYCL

I do not know the detail about ComputeCpp. On https://github.com/triSYCL/triSYCL/blob/master/doc/about-sycl.rst there is a link about a very interesting but old presentation:

Implementing the OpenCL SYCL Shared Source C++ Programming Model using Clang/LLVM, Gordon Brown. November 17, 2014, Workshop on the LLVM Compiler Infrastructure in HPC, SuperComputing 2014 http://www.codeplay.com/public/uploaded/publications/SC2014_LLVM_HPC.pdf

In the case triSYCL is targeting a device, there is also a device compiler. I have to push a new version with a design document... In the meantime, you can look at https://github.com/triSYCL/triSYCL/tree/device https://github.com/triSYCL/llvm https://github.com/triSYCL/clang

sycl-gtx is using some SYCL syntax extensions based on macros to have a meta-representation of the control flow in the kernel, as shown for example on this example: https://github.com/ProGTX/sycl-gtx/blob/master/tests/regression/work_efficient_prefix_sum.cpp

score 1 · Answer 2 · answered Jan 24 '18 at 17:04

And the answer is: This is not how it's done, and I still don't think it's possible.

Even my first assumption was wrong. If all you have is an ordinary C++ compiler, then any SYCL kernel can only be executed "in software", by the host device (CPU) running the "controller" code.

To translate the kernels to OpenCL (or SPIR-V) for execution on any other device, either an "augmented" compiler is necessary; or two compilers, one for the host, and one for the compute device.

A good explanation can be found here: https://www.codeplay.com/portal/introduction-to-sycl

The most related section is "What Would A SYCL Work Flow Look Like?"

How is control flow extracted from a SYCL kernel?

2 Answers2