I want to terminate a running CUDA kernel (A) in case I need to run another, higher-priority kernel (B) immediately. Is this possible? Alternatively, is there something like a watchdog timer that I could set before launching (A) to bound how long (A) runs?
Also, if this is possible, would the termination have any impact on the CUDA context of (A)? That is, if (A) had partially completed before termination, would I still be able to read that partial output from GPU memory?
Also, what would be the overhead of such a termination?
Edit: "How to interrupt or cancel a CUDA kernel from host code" doesn't answer my question, because my kernel (A) would not be polling for any "command" from the host. The termination command from the host is completely asynchronous with respect to the GPU. Also, I would like as little latency as possible between the host wanting to terminate (A) and (A) actually being terminated.