We have Python code that involves expensive linear algebra computations. The data is stored in NumPy arrays. The code uses numpy.dot and a few BLAS and LAPACK functions, which are currently accessed through scipy.linalg.blas and scipy.linalg.lapack. The current code is written for the CPU. We want to convert the code so that some of the NumPy, BLAS, and LAPACK operations are performed on a GPU.
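For context, the existing CPU code follows this general pattern (a minimal sketch, not our actual code; the matrices and the specific routines dgemm and dpotrf are illustrative):

```python
import numpy as np
from scipy.linalg import blas, lapack

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 4))
b = rng.standard_normal((4, 4))

# Plain NumPy matrix product
c = np.dot(a, b)

# The same product through the low-level BLAS wrapper: C = alpha * A @ B
c_blas = blas.dgemm(alpha=1.0, a=a, b=b)

# A LAPACK call: Cholesky factorization of a symmetric positive-definite matrix
spd = a @ a.T + 4.0 * np.eye(4)
chol, info = lapack.dpotrf(spd, lower=1)  # info == 0 on success
```

We would like the GPU version to preserve this mix of high-level NumPy calls and direct BLAS/LAPACK calls.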
I am trying to determine the best way to do this. As far as I can tell, Numba does not support BLAS and LAPACK functions on the GPU. It appears that PyCUDA may be the best route, but I am having trouble determining whether PyCUDA allows one to use both BLAS and LAPACK functions.
EDIT: We need the code to be portable to different GPU architectures, including AMD and Nvidia. While PyCUDA appears to offer the desired functionality, CUDA (and hence PyCUDA) cannot run on AMD GPUs.