
In a Linux system with multiple GPUs, how can you determine which GPU is running X11 and which is completely free to run CUDA kernels? In a system that pairs a low-powered GPU for X11 with a higher-powered GPU for kernels, a simple heuristic can pick the faster card. But on a system with two identical cards, that method cannot be used. Is there a CUDA and/or X11 API to determine this?

UPDATE: The command 'nvidia-smi -a' shows whether a "display" is connected or not. I have yet to determine whether this means physically connected, logically connected (running X11), or both. Running strace on this command shows lots of ioctls being invoked and no calls to X11, so I assume the card is reporting that a display is physically connected.

talonmies
dwelch91
  • Why can't a GPU be running both X *and* CUDA? X doesn't take that much processing. – Ignacio Vazquez-Abrams Jun 21 '11 at 16:59
  • If you run kernels on the GPU that is running X11, you can't run the debugger. Also, when running on the same GPU, if the kernel you are working on freezes, X11 also hangs, locking the display. – dwelch91 Jun 21 '11 at 17:16
  • Isn't the one running X11 the one with the display attached? It should have a run time limit on the kernel (which you can check with device properties), while the other card should have no run time limit (I think this holds true on Linux too, not only on Windows). – jmsu Jun 21 '11 at 19:02

2 Answers


There is a device property kernelExecTimeoutEnabled in the cudaDeviceProp structure which indicates whether the device is subject to a display watchdog timer. That is the best indicator of whether a given CUDA device is running X11 (or the Windows/Mac OS equivalent).

In PyCUDA you can query the device status like this:

In [1]: from pycuda import driver as drv

In [2]: drv.init()

In [3]: print(drv.Device(0).get_attribute(drv.device_attribute.KERNEL_EXEC_TIMEOUT))
1

In [4]: print(drv.Device(1).get_attribute(drv.device_attribute.KERNEL_EXEC_TIMEOUT))
0

Here device 0 has a display attached, and device 1 is a dedicated compute device.
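If you are working in C/C++ rather than PyCUDA, the same attribute can be read via the runtime API's cudaGetDeviceProperties. A minimal sketch (device ordering and output depend on your system):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main(void)
{
    int count = 0;
    cudaGetDeviceCount(&count);

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);

        // kernelExecTimeoutEnabled != 0 means a display watchdog is active,
        // i.e. the device is very likely driving X11 (or its equivalent).
        printf("Device %d (%s): watchdog %s\n", dev, prop.name,
               prop.kernelExecTimeoutEnabled
                   ? "enabled - likely running a display"
                   : "disabled - dedicated compute device");
    }
    return 0;
}
```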

talonmies

I don't know of any library function which could check that. However, one "hack" comes to mind: X11, or any other system component that manages a connected monitor, must consume some of the GPU memory.

So, query both devices through 'cudaGetDeviceProperties' and compare the 'totalGlobalMem' field. If the values are the same, try allocating that (or only a slightly smaller) amount of memory on each GPU and see which one fails to do so (cudaMalloc returning an error flag).
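A rough sketch of that probe, assuming the display's memory footprint is large enough to make the oversized allocation fail on the X11 device (untested, as this answer itself cautions; the 64 MB slack is an arbitrary choice):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main(void)
{
    int count = 0;
    cudaGetDeviceCount(&count);

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        cudaSetDevice(dev);

        // Request nearly all of the reported global memory. A device whose
        // memory is partly held by X11 should fail this allocation.
        size_t request = prop.totalGlobalMem - (64u << 20);
        void *p = NULL;
        cudaError_t err = cudaMalloc(&p, request);

        printf("Device %d (%s): allocating %zu bytes %s\n", dev, prop.name,
               request,
               err == cudaSuccess
                   ? "succeeded - probably free for compute"
                   : "failed - probably running a display");

        if (err == cudaSuccess)
            cudaFree(p);
    }
    return 0;
}
```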

Some time ago I read somewhere (I don't remember where) that if you increase your monitor resolution while there is an active CUDA context on the GPU, the context may get invalidated. That hints that the above suggestion might work. Note, however, that I have never actually tried it; it's just my wild guess.

If you manage to confirm that it works, or that it doesn't, let us know!

CygnusX1