I'm using Python/NumbaPro to run code on my CUDA-compliant GPU on a Windows box. I use Cygwin as my shell, and from within a Cygwin console it has no problem finding my CUDA device. I test with the simple command:

    numbapro.check_cuda()

But when I connect to the box over OpenSSH (part of my Cygwin setup), I get the following error:

    numba.cuda.cudadrv.error.CudaSupportError: Error at driver init:
    Call to cuInit results in CUDA_ERROR_NO_DEVICE:

How can I fix this?

  • Is sshd running as a service? If so, that is your problem. – talonmies Jun 30 '15 at 12:57
  • Thanks. I've changed sshd from running as a Windows service to running from the command line, so it now runs in the same user context as my other user. This results in a new error: "raise NotImplementedError('cannot determine number of cpus')". Why could that be? – Per Erik Gransøe Jul 01 '15 at 08:30
  • That isn't really anything to do with CUDA, but I think [this](http://stackoverflow.com/q/13544826/681865) probably answers that question (see the last answer). – talonmies Jul 01 '15 at 08:55
  • Thanks again! It flies now! – Per Erik Gransøe Jul 01 '15 at 10:47
  • I'll add a short summary answer to the question if you will accept it so we can get this off the unanswered list – talonmies Jul 01 '15 at 10:51

1 Answer

The primary cause of this is Windows service session 0 isolation. When you run any application via a service that runs in session 0 (sshd or Windows Remote Desktop, for example), the machine's native display driver is unavailable. For CUDA applications, this means that you get a "no device available" error at runtime: the sshd you use to log in is running as a service, so no CUDA driver is available to the processes it starts.

There are a few workarounds:

  1. Run the sshd as a process rather than a service.
  2. If you have a compatible GPU, use the TCC driver rather than the GPU display driver.
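For the second workaround, it can help to check which driver model each GPU is currently using. The following is only a sketch: it assumes `nvidia-smi` is on the `PATH` and that its `driver_model.current` query field is available on your driver version (`nvidia-smi --help-query-gpu` lists the supported fields). The helper names are hypothetical.

```python
import subprocess


def parse_driver_models(csv_text):
    """Turn nvidia-smi's one-value-per-line CSV output into a list of strings."""
    return [line.strip() for line in csv_text.splitlines() if line.strip()]


def query_driver_models():
    """Ask nvidia-smi for the current driver model of every GPU.

    Assumption: nvidia-smi is installed and reachable on PATH.
    On Windows the values are expected to be "WDDM" or "TCC".
    """
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=driver_model.current", "--format=csv,noheader"]
    )
    return parse_driver_models(out.decode("ascii", "replace"))
```

Calling `query_driver_models()` from the SSH session and from a local console should report the same values; a GPU reported as `WDDM` is using the display driver that is affected by session 0 isolation.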

On the secondary problem, the Python runtime error you are seeing comes from the multiprocessing module. From this question it appears that the root cause is probably the NUMBER_OF_PROCESSORS environment variable not being set. You can use one of the workarounds in that thread to get around that problem.
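One possible form of that workaround is to set the variable from Python before anything asks for the CPU count. This is a minimal sketch, not the exact fix from that thread: the helper name `ensure_processor_count` and the fallback value are assumptions, and you should pick a fallback that matches your machine.

```python
import os


def ensure_processor_count(env=None, fallback=1):
    """Make sure NUMBER_OF_PROCESSORS holds a positive integer.

    env is a mapping such as os.environ (used by default).
    fallback is a hypothetical default applied only when the variable
    is missing or not a positive integer. Returns the value in effect.
    """
    if env is None:
        env = os.environ
    raw = env.get("NUMBER_OF_PROCESSORS", "")
    if raw.isdigit() and int(raw) > 0:
        return int(raw)
    env["NUMBER_OF_PROCESSORS"] = str(fallback)
    return fallback
```

Call `ensure_processor_count(fallback=4)` at the top of the script, before the imports that trigger the CPU count lookup. Exporting the variable in the shell profile used by the SSH session should have the same effect.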

talonmies
  • I think it would be interesting to test whether this (the session 0 isolation limitation) is still true with a very recent NVIDIA Windows driver, such as 353.30. NVIDIA has worked around some of these limitations. – Robert Crovella Jul 01 '15 at 14:41