I've been trying to get PyOpenCL and PyCUDA running on a Linux Mint machine. I have things installed but the demo scripts fail with the error:

pyopencl.cffi_cl.LogicError: clgetplatformids failed: PLATFORM_NOT_FOUND_KHR
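The failure can be isolated to platform enumeration, before any context or kernel is created. A minimal sketch, assuming PyOpenCL is importable; `list_platform_names` is a hypothetical helper name, not part of PyOpenCL:

```python
def list_platform_names():
    """Return OpenCL platform names, [] if enumeration fails, None without PyOpenCL."""
    try:
        import pyopencl as cl
    except ImportError:
        return None
    try:
        # get_platforms() is the call that wraps clGetPlatformIDs.
        return [platform.name for platform in cl.get_platforms()]
    except cl.Error as exc:
        # On the machine described here this is where the LogicError surfaces.
        print("PyOpenCL could not enumerate platforms:", exc)
        return []


if __name__ == "__main__":
    print(list_platform_names())
```

An empty list here means the ICD loader found no usable vendor library, which points at system configuration rather than at the demo scripts.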

Configuration

$ uname -a && cat /etc/lsb-release && lspci | grep NV

    Linux 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
    DISTRIB_DESCRIPTION="Linux Mint 17.3 Rosa"
    01:00.0 VGA compatible controller: NVIDIA Corporation GK208 [GeForce GT 730] (rev a1)

Relevant installed packages:

libcuda1-352-updates
libcudart5.5:amd64
nvidia-352-updates
nvidia-352-updates-dev
nvidia-cuda-dev
nvidia-cuda-toolkit
nvidia-opencl-icd-352-updates
nvidia-profiler
nvidia-settings
ocl-icd-libopencl1:amd64
ocl-icd-opencl-dev:amd64
opencl-headers
python-pycuda
python-pyopencl
python3-pycuda
python3-pyopencl

Research

This post describes a scenario in which the OpenCL/CUDA implementations installed by the package manager don't set up some symlinks correctly. That issue doesn't seem to be present on my system.
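One way to rule out the symlink problem is to look for links whose targets are missing. A rough sketch; the function name, the default pattern, and the library directory are assumptions (the directory is the Ubuntu 14.04 path quoted in the comments), so adjust them for the system at hand:

```python
import glob
import os


def find_broken_symlinks(directory, pattern="libOpenCL*"):
    """Return symlinks in `directory` matching `pattern` whose target is missing."""
    broken = []
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        # islink() inspects the link itself; exists() follows it to the target.
        if os.path.islink(path) and not os.path.exists(path):
            broken.append(path)
    return broken


if __name__ == "__main__":
    # Assumed library directory; other distributions may use a different one.
    for pattern in ("libOpenCL*", "libnvidia-opencl*", "libcuda*"):
        print(pattern, find_broken_symlinks("/usr/lib/x86_64-linux-gnu", pattern))
```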

There was a version number mismatch between the graphics drivers (which were nvidia-340) and the nvidia-opencl package (352). I updated the graphics drivers to nvidia-352-updates-dev, but the issue remains.

There is a bug in Arch linux that seems to revolve around the necessary device files not being created. However, I've verified that the /dev/nvidia0 and /dev/nvidiactl exist and have permissions 666, so they should be accessible.
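The device-file check can be scripted so it is easy to repeat after each driver change. A small sketch using only the standard library; `describe_device_nodes` is a hypothetical helper, and the default paths are the two device files named above:

```python
import os
import stat


def describe_device_nodes(paths=("/dev/nvidia0", "/dev/nvidiactl")):
    """Map each path to None (missing) or a dict describing its mode and type."""
    report = {}
    for path in paths:
        try:
            st = os.stat(path)
        except OSError:
            report[path] = None  # missing, or not stat-able by this user
            continue
        report[path] = {
            "mode": oct(stat.S_IMODE(st.st_mode)),      # permission bits only
            "char_device": stat.S_ISCHR(st.st_mode),    # real GPU nodes are character devices
            "can_read_write": os.access(path, os.R_OK | os.W_OK),
        }
    return report


if __name__ == "__main__":
    for path, info in describe_device_nodes().items():
        print(path, info)
```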

Another Stack Overflow post suggests running the demos as root. I have tried this and the behavior does not change.

Older installation instructions for CUDA/OpenCL say to download drivers directly from the NVIDIA website. I'm not sure this still applies, so I'm holding off on that for now since there seem to be plenty of relevant packages in the repositories.

The same error, but for an ATI card on a different Linux system, was resolved by putting the proper files in /usr/lib/OpenCL/vendors. That path isn't used on my system. However, I do have /etc/OpenCL/vendors/nvidia.icd, which contains the line libnvidia-opencl.so.1, suggesting my issue is dissimilar.
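Having the .icd file is only half of the check: the library it names must also be loadable. As far as I understand the ICD mechanism, the loader hands each entry to the dynamic linker, so either a bare name or a full path can work as long as it resolves. A sketch of that check; `read_icd_entries` and `can_load` are hypothetical helper names:

```python
import ctypes
import glob
import os


def read_icd_entries(vendors_dir="/etc/OpenCL/vendors"):
    """Map each .icd file in `vendors_dir` to the non-empty lines it contains."""
    entries = {}
    for icd_path in sorted(glob.glob(os.path.join(vendors_dir, "*.icd"))):
        with open(icd_path) as handle:
            lines = [line.strip() for line in handle]
        entries[icd_path] = [line for line in lines if line]
    return entries


def can_load(library):
    """Return True if the dynamic linker can open `library` (name or full path)."""
    try:
        ctypes.CDLL(library)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    for icd_path, libraries in read_icd_entries().items():
        for library in libraries:
            print(icd_path, library, "loadable" if can_load(library) else "NOT loadable")
```

If an entry reports as not loadable, the vendor library is missing from the linker search path or does not match the installed driver version.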

This error has been observed on OSX, but for unrelated reasons. Similar error messages for PyCUDA also appear to be unrelated.

This error can occur under remote access since the device files are not initialized if X is not loaded. However, I'm testing this in a desktop environment. Furthermore, I've run the manual commands suggested in that thread just to be sure, and they are redundant since the relevant /dev entries already exist.

There is a note here about simply running a command a few times to get around some sort of temporary glitch. That doesn't seem to help.

This post describes how the similar cuInit failed: no device CUDA error was caused by not having the user in the video group. To check, I ran usermod -a -G video $USER, but it did not resolve my issue.
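Note that group changes made with usermod only apply to sessions started after the change, so it is worth confirming that the running process actually carries the group. A minimal sketch; `process_in_group` is a hypothetical helper name:

```python
import grp
import os


def process_in_group(group_name):
    """Return True if the current process belongs to the named group."""
    try:
        gid = grp.getgrnam(group_name).gr_gid
    except KeyError:
        return False  # the group does not exist on this system
    # getgroups() lists supplementary groups; the effective gid is checked separately.
    return gid in os.getgroups() or gid == os.getegid()


if __name__ == "__main__":
    print("in video group:", process_in_group("video"))
```

A False result right after running usermod usually just means the session predates the change; logging out and back in should update it.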

In the past, routine updates have broken CUDA support. I have not taken the time to explore every permutation of package version numbers, and it's possible that downgrading some packages may change the situation. However, without further intuition about the source of the issue, I'm not going to invest time in doing that since I don't know whether it will work.

The most common Google search result for this error, appearing four times on the first pages, is a short and unresolved email thread on the PyOpenCL list. Checking the permission bits for /dev/nvidia0 and /dev/nvidiactl is suggested. On my machine user/group/other all have read and write access to these devices, so I don't think that's the source of the trouble.

I've also tried building and installing PyOpenCL from the latest source, rather than using the version in the repositories. This fails at an earlier phase, which suggests to me that it is not building correctly.

Summary

The issue would appear to be that PyCUDA/PyOpenCL cannot locate the graphics card. There are several known issues that can cause this, but none of them seem to apply here. I'm missing something, and I'm not sure what else to do.

MRule
  • Can you install the nvidia cuda sdk and run the examples? In particular, the deviceQuery function will let you know if you can talk to the device independent of whether your OpenCL infrastructure is properly configured. The other approach is to install the Intel OpenCL drivers to see if you can get pyopencl talking to the CPU. I did have problems when both nvidia and amd drivers were installed in /etc/OpenCL/vendors, but that was with the amd driver not the nvidia driver. My nvidia.icd file contains the full path /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1 for Ubuntu 14.04. – Neapolitan May 16 '16 at 13:33
  • I installed `cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb` but haven't been able to find the command `deviceQuery`. – MRule May 16 '16 at 17:48
  • I attempted to reinstall from `cuda_7.5.18_linux.run`. The installer noted that my machine's configuration was unsupported. I proceeded anyway as a last-ditch effort. Indeed, now the X configuration is completely broken and the machine cannot boot to the login screen. – MRule May 17 '16 at 09:50
  • As this appears to be a configuration issue, and not related to the OpenCL/CUDA API or implementation thereof, it should be moved to Super User or Unix & Linux. – MRule May 17 '16 at 10:34
  • I feel for you, having done this to myself more than once. I don't remember off hand how I uninstalled nvidia's drivers, so hopefully they can help you over at Super User. When you get X back and are ready to continue, use `$ locate deviceQuery` to find the source directory, or `$ dpkg -L cuda-repo-[Tab completion]` to list the files installed from the .deb. – Neapolitan May 17 '16 at 13:40
  • I wonder if there is some way to just call up NVidia about this. Somehow it seems that the state of OpenCL/CUDA on linux is worse than when I tried it 7 years ago. – MRule May 17 '16 at 13:59
  • I have the nvidia-352 driver running on my machine (not nvidia-352-updates) and it is supporting my GeForce GTX 980 Ti (GK110) without issue. I used nvidia's repos for apt. You can configure them by installing the deb file from https://developer.nvidia.com/cuda-downloads. Note that deviceQuery is available from the cuda-samples-7-5 package. This installs the source in /usr/local/cuda-7.5/samples/, and you will need to run make to build the executable. – Neapolitan May 17 '16 at 16:02
  • I've installed `cuda-samples-7-5` but `deviceQuery` is not available on my PATH. Additionally, the cuda files, etc, aren't showing up. I must be confused. Once the repository is added, a successful execution of `sudo apt-get install cuda-samples-7-5` should suffice, no? – MRule May 17 '16 at 16:15
  • These are source examples. You will either need to `sudo make` from /usr/local/cuda-7.5/samples, or copy the tree to your local directory and run make. – Neapolitan May 17 '16 at 16:20
  • `deviceQuery` returns the card information but python opencl is still returning `clGetPlatformIDs failed: platform not found khr`. This is commonly caused by permissions issues on the GPU device files, but the error appears even when run as root. – MRule May 17 '16 at 17:12
  • I have nvidia-opencl-icd-352 installed. This installed the nvidia opencl drivers and put /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1 into /etc/OpenCL/vendors/nvidia.icd. – Neapolitan May 17 '16 at 19:07
  • You can find a "hello world" program in c on opencl on the net and try that; that will at least separate opencl issues from pyopencl issues. Also, the intel opencl driver does support the cpu, though this might add more confusion long term since you then have multiple devices to choose from when starting opencl. – Neapolitan May 17 '16 at 19:49
  • It worked after installing `nvidia-modprobe` and rebooting. I'm not sure I could reproduce all the steps I took, but I should really document all of this for the future. I may post that as a self-answered question on Unix & Linux, since I'm not sure this question should stay on Stack Overflow. Thank you all for your help and patience; it's immensely appreciated. – MRule May 18 '16 at 10:13
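The suggestion in the comments to separate OpenCL issues from PyOpenCL issues does not strictly need a C compiler. The sketch below swaps the C "hello world" for a direct ctypes call to `clGetPlatformIDs` in the system loader, bypassing PyOpenCL entirely. The library name `libOpenCL.so.1` and the helper name `count_opencl_platforms` are assumptions; the status value -1001 is `CL_PLATFORM_NOT_FOUND_KHR` from the ICD extension.

```python
import ctypes

CL_SUCCESS = 0
CL_PLATFORM_NOT_FOUND_KHR = -1001


def count_opencl_platforms(lib_name="libOpenCL.so.1"):
    """Return (status, platform_count), or None if the loader library cannot be opened."""
    try:
        cl = ctypes.CDLL(lib_name)
    except OSError:
        return None
    # cl_int clGetPlatformIDs(cl_uint num_entries, cl_platform_id *platforms,
    #                         cl_uint *num_platforms)
    cl.clGetPlatformIDs.argtypes = [
        ctypes.c_uint32,
        ctypes.c_void_p,
        ctypes.POINTER(ctypes.c_uint32),
    ]
    cl.clGetPlatformIDs.restype = ctypes.c_int32
    count = ctypes.c_uint32(0)
    # Passing 0 entries and a NULL array asks only for the number of platforms.
    status = cl.clGetPlatformIDs(0, None, ctypes.byref(count))
    return status, count.value


if __name__ == "__main__":
    result = count_opencl_platforms()
    if result is None:
        print("OpenCL loader library not found")
    elif result[0] == CL_PLATFORM_NOT_FOUND_KHR:
        print("loader works, but no vendor platform was found")
    else:
        print("status", result[0], "platforms", result[1])
```

If this also reports that no platform was found, the problem lies below PyOpenCL, in the driver or ICD setup, which is consistent with the fix above coming from a system package and a reboot.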

0 Answers