In the OpenCL world there is a function, clGetEventProfilingInfo, which returns the profiling information for an event: its queued, submitted, start, and end times in nanoseconds. This is quite convenient because I can print that information whenever I want.
For example, with PyOpenCL it is possible to write code like this:
# event is a pyopencl.Event from a command queue created with profiling enabled
profile = event.profile
print("%gs + %gs" % (1e-9*(profile.end - profile.start), 1e-9*(profile.start - profile.queued)))
which is quite informative for my task.
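To make the arithmetic concrete, here is a minimal sketch that runs without an OpenCL device. It assumes a stand-in object with the same `queued`, `start`, and `end` attributes as the PyOpenCL profile; the helper name `format_timing` and the timestamp values are hypothetical, chosen only for illustration.

```python
from types import SimpleNamespace


def format_timing(profile):
    """Format execution time and queue-to-start delay, both in seconds.

    `profile` is any object exposing `queued`, `start`, and `end`
    timestamps in nanoseconds, as the PyOpenCL profile object does.
    """
    exec_s = 1e-9 * (profile.end - profile.start)     # time spent executing
    wait_s = 1e-9 * (profile.start - profile.queued)  # delay before execution began
    return "%gs + %gs" % (exec_s, wait_s)


# Hypothetical timestamps in nanoseconds (not measured on real hardware).
fake_profile = SimpleNamespace(queued=1_000_000, start=1_500_000, end=4_000_000)

timing = format_timing(fake_profile)
print(timing)
```

The first number is the execution time and the second is how long the command waited between being queued and starting to run.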
Is it possible to get this kind of information from within code, instead of using an external profiling tool such as nvprof and the like?