
I know that Caffe performs convolutions using GEneral Matrix to Matrix Multiplication (GEMM), which is part of the Basic Linear Algebra Subprograms (BLAS) library: each convolution is converted into a matrix multiplication. I have referred to the article below: https://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/
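
To make this concrete, here is a minimal NumPy sketch of the im2col + GEMM idea (my own illustration, not Caffe's actual code): every convolution window is unrolled into a column of a matrix, and the convolution then becomes a single matrix multiplication.

```python
import numpy as np

def conv2d_as_gemm(image, kernels):
    """image: (C, H, W), kernels: (K, C, kh, kw); stride 1, no padding."""
    C, H, W = image.shape
    K, _, kh, kw = kernels.shape
    out_h, out_w = H - kh + 1, W - kw + 1

    # im2col: unroll every (C, kh, kw) patch into one column.
    cols = np.empty((C * kh * kw, out_h * out_w))
    for i in range(out_h):
        for j in range(out_w):
            cols[:, i * out_w + j] = image[:, i:i + kh, j:j + kw].ravel()

    # GEMM: (K, C*kh*kw) x (C*kh*kw, out_h*out_w) -> (K, out_h*out_w)
    out = kernels.reshape(K, -1) @ cols
    return out.reshape(K, out_h, out_w)

image = np.random.rand(3, 8, 8)       # 3-channel 8x8 input
kernels = np.random.rand(4, 3, 3, 3)  # 4 filters of size 3x3
print(conv2d_as_gemm(image, kernels).shape)  # (4, 6, 6)
```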

I want to understand how other deep learning frameworks like Theano, TensorFlow, and PyTorch perform convolution operations. Do they use similar libraries in the backend? There might be some articles on this topic; if someone can point me to them, or explain with an answer, that would help.

PS: I posted the same question on datascience.stackexchange.com. As I didn't get a reply there, I am posting it here as well. If there is a better forum to post this question, please let me know.

Gaurav Srivastava

1 Answer


TensorFlow has multiple backends for these operations.

For GPU, CUDA support is used. Most of the operations are implemented with cuDNN, some use cuBLAS, and others use plain CUDA kernels.

You can also use OpenCL instead of CUDA, but then you have to compile TensorFlow yourself.

For CPU, Intel MKL is used as the BLAS library.
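
As a quick check (a sketch assuming a TensorFlow 2.x build; the exact keys in the build-info dict vary between versions), you can ask TensorFlow which CUDA/cuDNN libraries it was compiled against:

```python
import tensorflow as tf

print(tf.test.is_built_with_cuda())            # True for a CUDA-enabled build
info = tf.sysconfig.get_build_info()           # dict describing the build
print(info.get("cuda_version"), info.get("cudnn_version"))
print(tf.config.list_physical_devices("GPU"))  # GPUs TensorFlow can actually see
```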

I'm not familiar with PyTorch and Theano, but some commonly used BLAS and GPU libraries are listed below:

  • cuDNN, cuBLAS, and CUDA: NVIDIA GPU support, the most popular option
  • OpenCL: cross-vendor GPU support; I don't know about it at all
  • MKL: CPU BLAS library provided by Intel
  • OpenBLAS: open-source CPU BLAS library
Kaihong Zhang
  • To add for PyTorch: [The documentation](https://pytorch.org/docs/stable/torch.html) mentions some specific BLAS/LAPACK operations, e.g. `addbmm`. Additionally, it is helpful to mention that some operations fall back to NumPy, which can have a configurable BLAS backend (e.g., the version shipped with Anaconda Python has Intel MKL enabled natively). – dennlinger Aug 13 '18 at 06:35
  • @dennlinger: unless Kaihong Zhang updates his answer to integrate your comment, I'd say it is worth its own answer... :) – benjaminplanche Aug 13 '18 at 09:11
  • I'm not so familiar with PyTorch, could you please add a new answer to explain it? @dennlinger – Kaihong Zhang Aug 13 '18 at 10:32
  • Alright, will do this maybe later today. Also it seems that specifically the operations asked about (Convolution) are implemented in a C backend (for PyTorch). More in my answer. – dennlinger Aug 13 '18 at 10:47
  • I was going through this blog (http://www.goldsborough.me/cuda/ml/cudnn/c++/2017/10/01/14-37-23-convolutions_with_cudnn/) on the implementation of convolution using cuDNN. It seems all the high-level deep learning libraries use the cuDNN convolution function, which has three ways to implement convolution: CUDNN_CONVOLUTION_FWD_ALGO_GEMM (models convolution as an explicit matrix multiplication), CUDNN_CONVOLUTION_FWD_ALGO_FFT (uses a Fast Fourier Transform for the convolution), and CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD (employs the Winograd algorithm to perform the convolution); see the sketch after these comments. – Gaurav Srivastava Aug 14 '18 at 05:09
  • @KaihongZhang From my last comment, I understand that all the high-level deep learning libraries can use GEMM for convolution through cuDNN, if the CUDNN_CONVOLUTION_FWD_ALGO_GEMM option is selected. However, I am not sure what kind of BLAS implementation cuDNN uses, or whether it has a BLAS implementation of its own. – Gaurav Srivastava Aug 14 '18 at 05:17
  • cuDNN is not open source; NVIDIA never tells us what it is like inside. – Kaihong Zhang Aug 14 '18 at 06:39
  • https://stackoverflow.com/questions/41518379/why-was-eigen-chosen-for-tensorflow says TensorFlow uses 'Eigen' instead of BLAS. (BLAS can be accelerated by cuBLAS, and some frameworks can use cuDNN, a higher-level acceleration using CUDA.) – Chan Kim May 18 '20 at 05:39
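
Regarding the cuDNN algorithm choice and the NumPy BLAS backend mentioned in the comments above, here is a rough Python sketch of how this is exposed at the framework level (these are real flags, but behaviour differs between versions):

```python
import numpy as np
import torch

# NumPy: print which BLAS/LAPACK libraries (MKL, OpenBLAS, ...) it links against.
np.show_config()

# PyTorch: cuDNN-related switches; cuDNN itself picks among its internal
# algorithms (e.g. GEMM, FFT, Winograd) when running a convolution.
if torch.cuda.is_available():
    print(torch.backends.cudnn.version())  # e.g. 8200 for cuDNN 8.2
    torch.backends.cudnn.enabled = True    # use cuDNN kernels when possible
    torch.backends.cudnn.benchmark = True  # benchmark and cache the fastest algorithm
```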