30

Is it possible to stop all running processes that are using the GPU via CUDA, without restarting the machine?

einpoklum
Christopher Dorian
  • You could always temporarily change the permissions of /dev/nvidiaxx. I haven't tried it, but I believe that would kill the jobs instantly. I don't know of any way to identify the jobs running specifically on the GPU unless you are using some kind of queue or load leveler. – Marm0t Dec 07 '10 at 01:55

4 Answers

29

The lsof utility will help with this. You can get a list of processes accessing your NVIDIA cards with:

lsof /dev/nvidia*

Then use kill or pkill to terminate the processes you want. Note that you may not want to kill X if it's running. On my desktop system, both X and kwin are also accessing the GPU.
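To make the "don't kill X" caveat concrete, here is a minimal sketch that filters the lsof listing by command name before anything is killed. The helper name gpu_pids_except is hypothetical, and the names Xorg and kwin are assumptions about how the display processes appear in the COMMAND column on a given system (lsof truncates that column by default, so check the real output first).

```shell
#!/usr/bin/env bash
# gpu_pids_except (hypothetical helper): reads `lsof` output on stdin and
# prints each PID once, skipping any COMMAND named in the arguments.
gpu_pids_except() {
    local excluded=" $* "
    awk -v excluded="$excluded" '
        NR == 1 { next }                       # skip the header row
        index(excluded, " " $1 " ") { next }   # skip excluded command names
        !seen[$2]++ { print $2 }               # print every PID only once
    '
}

# Possible usage (assumes GNU xargs for -r, which skips kill on empty input):
#   lsof /dev/nvidia* 2>/dev/null | gpu_pids_except Xorg kwin | xargs -r kill
```

The function only prints PIDs, so it can be run on its own to check the result before piping it into kill.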

i_grok
  • This does not work for me. Killing my kernel process has no effect. The kernel process is indefinitely consuming the GPU and I can't kill it. – thatWiseGuy Apr 08 '17 at 21:45
16

Long answer:

lsof /dev/nvidia*

lists the processes that are using your GPU, together with their PIDs. The output looks something like:

lsof: status error on PID: No such file or directory
COMMAND  PID    USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
python  7215 *******  mem    CHR 195,255           434 /dev/nvidiactl
python  7215 *******  mem    CHR   195,0           435 /dev/nvidia0

and

awk 'NR>1 {print $2}'

skips the header row and selects the PID column (in my case it is the second column), and

xargs -I {} kill {}

kills the processes with those PIDs.

Short answer:

You can use the following command to kill them all at once.

Watch out! This command kills every process that lsof /dev/nvidia* lists. Run lsof /dev/nvidia* on its own first to confirm that these are the processes you want to kill.

lsof /dev/nvidia* | awk 'NR>1 {print $2}' | xargs -I {} kill {}

This finishes the job in a single command.
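Since the one-liner is destructive, a dry-run step can help. The sketch below is one possible way to do that, not part of the original answer: kill_listed is a hypothetical helper and DRY_RUN is an assumed variable name. It reads PIDs on stdin, ignores anything that is not a plain number (such as a stray header), and only prints what it would do unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# kill_listed (hypothetical helper): reads one PID per line on stdin.
# DRY_RUN (assumed name) defaults to 1, which only prints the plan.
kill_listed() {
    local pid
    while read -r pid; do
        # Skip empty lines and anything containing a non-digit.
        case "$pid" in
            ''|*[!0-9]*) continue ;;
        esac
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "would kill $pid"
        else
            kill "$pid"
        fi
    done
}

# Possible usage; `lsof -t` prints bare PIDs with no header:
#   lsof -t /dev/nvidia* | kill_listed             # preview only
#   lsof -t /dev/nvidia* | DRY_RUN=0 kill_listed   # actually send SIGTERM
```

Defaulting to the dry run means a mistyped invocation prints a list instead of killing anything.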

user1165814
13

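You can check the processes with nvidia-smi and then kill the one you want by its PID. Note that -9 sends SIGKILL, which the process cannot catch or ignore: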

kill -9 <pid>
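If reading PIDs off the nvidia-smi table by hand is tedious, its CSV query mode can feed a small filter. This is only a sketch: compute_pids_over is a hypothetical helper, the 100 MiB threshold is an arbitrary example, and the input is assumed to look like "pid, process_name, used_memory" as printed by the query shown in the usage comment.

```shell
#!/usr/bin/env bash
# compute_pids_over (hypothetical helper): reads CSV rows of the form
#   pid, process_name, used_memory
# on stdin and prints the PIDs using more than MIN_MIB of GPU memory.
compute_pids_over() {
    local min_mib=$1
    # "$3 + 0" keeps the leading number of a value such as "1024 MiB".
    awk -F', *' -v min="$min_mib" '$3 + 0 > min + 0 { print $1 }'
}

# Possible usage:
#   nvidia-smi --query-compute-apps=pid,process_name,used_memory \
#              --format=csv,noheader | compute_pids_over 100
```

As with the other approaches, inspect the printed PIDs before passing them to kill.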
Michele
Oliver Nina
1

You can use the fuser command to get all the processes using CUDA and then kill them. There's also a nice single command that kills them all:

sudo fuser -k /dev/nvidia*
Soma Siddhartha
  • Note that unlike some other commands such as `kill`, the signal sent by `fuser -k` by default is `SIGKILL` instead of `SIGTERM`. – ebk Dec 14 '22 at 00:48
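Because fuser -k defaults to SIGKILL, processes get no chance to clean up. With the psmisc version of fuser, a different signal can be named, for example sudo fuser -k -TERM /dev/nvidia*. The sketch below shows a more gradual pattern under stated assumptions: term_then_kill is a hypothetical helper that sends SIGTERM first and falls back to SIGKILL only for processes that are still alive after a timeout (applied per PID, one after another, which is a simplification).

```shell
#!/usr/bin/env bash
# term_then_kill (hypothetical helper): term_then_kill TIMEOUT_SECONDS PID...
# Sends SIGTERM to every PID, waits up to TIMEOUT_SECONDS for each one,
# then sends SIGKILL to any that is still running.
term_then_kill() {
    local timeout=$1
    shift
    local pid waited
    kill -TERM "$@" 2>/dev/null || true    # ask politely first
    for pid in "$@"; do
        waited=0
        # `kill -0` sends no signal; it only checks that the PID exists.
        while kill -0 "$pid" 2>/dev/null && [ "$waited" -lt "$timeout" ]; do
            sleep 1
            waited=$((waited + 1))
        done
        if kill -0 "$pid" 2>/dev/null; then
            kill -KILL "$pid" 2>/dev/null || true
        fi
    done
    return 0
}

# Possible usage (lsof -t prints bare PIDs):
#   term_then_kill 5 $(lsof -t /dev/nvidia*)
```

Killing processes owned by other users still requires root, just as with sudo fuser -k.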