I am using the ROCm software stack to compile and run OpenCL programs on an AMD Polaris20 (4th-generation GCN) GPU. Is there a way to find out which compute unit (by ID) on the GPU is currently executing a given work-item or wavefront?
In other words, can I associate a computation in a kernel with a specific compute unit or other specific hardware on the GPU, so that I can keep track of which part of the hardware is being utilized while a kernel runs?
Thank you!