I'm trying to get a handle on SYCL's kernel_bundle functionality. My goal is to use it as a store for my custom kernels compiled ahead of time (AOT). Ideally, I would precompile these kernels into a standalone library that other libraries can query for a specific implementation at runtime, and pass the result around in an ABI-safe way across different compilation units.
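To make the lookup side of that goal concrete, here is a minimal host-only sketch of the kind of interface I have in mind. It is only a stand-in and uses no SYCL at all: the names `kernel_fn`, `add_impl` and `find_kernel` are hypothetical, and in the real library the implementation behind the pointer would submit a SYCL kernel instead of running a loop.

```cpp
#include <cstddef>
#include <cstring>

extern "C" {

// Plain C function-pointer type, so no C++ class crosses the library boundary.
typedef void (*kernel_fn)(const int* a, const int* b, int* c, std::size_t n);

// Hypothetical precompiled implementation living inside the library.
static void add_impl(const int* a, const int* b, int* c, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) c[i] = a[i] + b[i];
}

// Hypothetical exported entry point: returns nullptr for unknown names.
kernel_fn find_kernel(const char* name) {
    struct Entry { const char* name; kernel_fn fn; };
    static const Entry table[] = {{"add", add_impl}};
    for (const Entry& e : table) {
        if (std::strcmp(e.name, name) == 0) return e.fn;
    }
    return nullptr;
}

} // extern "C"
```

A client would call `find_kernel("add")`, check the result for `nullptr`, and invoke the returned pointer. What I would like is for kernel_bundle to play the role of the table.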

I've hit a couple of roadblocks:

  1. It seems the constructors for both sycl::kernel and sycl::kernel_bundle are either marked deleted or private.
  2. I'm a bit muddled about the proper workflow: How do I instantiate a kernel, bundle it, build this bundle, fetch the kernel using its ID, and then execute it with specialization constants?

I looked into the SYCL 2020 specification (https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html), but some of its examples seem to miss the point of using a kernel_bundle, for instance the ones in sections 4.11.15.1 and 4.11.15.2. They obtain a kernel_bundle but then go on to execute a separate, inline kernel implementation.

I would appreciate it if someone could shed light on this or share a comprehensive example illustrating AOT compilation and runtime specialization using kernel_bundle with custom kernels in SYCL.

Thanks in advance!

I was expecting something along these lines:

#include <sycl/sycl.hpp>

#include <cstddef>
#include <iostream>
#include <vector>

// Named function object: its type doubles as the kernel name.
class AddKernel {
public:
    using ReadAcc = sycl::accessor<int, 1, sycl::access_mode::read>;
    using WriteAcc = sycl::accessor<int, 1, sycl::access_mode::write>;

    AddKernel(ReadAcc a, ReadAcc b, WriteAcc c)
        : a_(a), b_(b), c_(c) {}

    // Kernel function objects must be callable as const
    void operator()(sycl::id<1> idx) const {
        c_[idx] = a_[idx] + b_[idx];
    }

private:
    ReadAcc a_;
    ReadAcc b_;
    WriteAcc c_;
};

int main() {
    sycl::queue q;

    // Create data for the kernel
    std::vector<int> a = {1, 2, 3, 4, 5};
    std::vector<int> b = {5, 4, 3, 2, 1};
    std::vector<int> c(5);

    // 1st approach (does not compile): sycl::kernel cannot be constructed
    // from a kernel type
    // sycl::kernel add(AddKernel);
    // auto bundle = add.get_kernel_bundle();

    // 2nd approach: look up the kernel id from the kernel name, then request
    // an input-state bundle that contains it
    sycl::kernel_id add_id = sycl::get_kernel_id<AddKernel>();
    auto bundle = sycl::get_kernel_bundle<sycl::bundle_state::input>(
        q.get_context(), {add_id});

    // Build the input bundle into an executable bundle
    auto built = sycl::build(bundle);

    // Fetch the kernel object by its id (not needed for the launch below)
    sycl::kernel add = built.get_kernel(add_id);

    {
        // Create buffers for the data; they copy back to the vectors when
        // this scope ends
        sycl::buffer<int> buf_a(a.data(), a.size());
        sycl::buffer<int> buf_b(b.data(), b.size());
        sycl::buffer<int> buf_c(c.data(), c.size());

        // Submit the kernel for execution using the kernel_bundle
        q.submit([&](sycl::handler& h) {
            // Take the kernel from the built bundle
            h.use_kernel_bundle(built);

            auto acc_a = buf_a.get_access<sycl::access_mode::read>(h);
            auto acc_b = buf_b.get_access<sycl::access_mode::read>(h);
            auto acc_c = buf_c.get_access<sycl::access_mode::write>(h);

            // Create an instance of the AddKernel functor with the necessary data
            AddKernel addKernel(acc_a, acc_b, acc_c);

            // Execute the kernel; the name AddKernel links this call to the
            // kernel in the bundle
            h.parallel_for<AddKernel>(sycl::range<1>(a.size()), addKernel);
        });

        // Wait for the kernel execution to complete
        q.wait();
    }

    // Print the results
    for (std::size_t i = 0; i < c.size(); i++) {
        std::cout << a[i] << " + " << b[i] << " = " << c[i] << std::endl;
    }

    return 0;
}
