I have two MPI processes, a server and a client, which communicate via MPI. The server can handle multiple clients. I am using MPICH 1.4 and CUDA 6.5 on a Windows 7 x64 machine.

On my localhost, I usually run one client instance and one server instance.

When I execute the following command through a command prompt started with administrator privileges:

    mpiexec.exe -n 1 Server.exe : -n 1 Client.exe

the processes do start executing, but after a while I get the following CUDA error:

    CUDA error at E:\github\project\src\cudahelper.cpp:71 code=38(cudaErrorNoDevice) "cudaGetDeviceCount( &_devCount )"

I then reconfigured smpd and mpiexec with my current domain user account and password, following a guide. I double-checked in Services and restarted the smpd service, but I still get the same error.

After some experimentation, I found that if I instead run the MPI processes with the -localonly option, using the following command, I get no CUDA errors:

    mpiexec.exe -localonly -n 1 Server.exe : -localonly -n 1 Client.exe

Any ideas what causes the CUDA error when the MPI processes are launched as non-local processes (that is, without -localonly)?

nurabha
  • It's probably related to the fact that [windows WDDM display devices are only accessible from certain service levels in windows](http://www.mathworks.com/matlabcentral/answers/98712-why-am-i-not-able-to-access-a-gpu-card-on-a-windows-vista-windows-7-machine-managed-under-job-manage). Are you running a GeForce or Quadro GPU that is acting as a WDDM device? – Robert Crovella May 25 '15 at 11:13
  • @RobertCrovella: Yes, I am using GeForce 780 GPUs on Windows 7 x64. AFAIK, TCC can't be enabled for these GPUs using the SMI utility. Can I still get around this problem in some other way? – nurabha May 25 '15 at 12:15
  • @RobertCrovella: Actually, this is only an MPI-specific issue. Non-MPI CUDA code runs fine on my machine. Are you sure this issue is related to WDDM drivers? – nurabha May 25 '15 at 12:19
