Why is dgemm and sgemm much slower (200x) than numpy's dot? Is it expected and normal?
The following is the code I use to test:
from scipy.linalg import blas
import numpy as np
import time
x2 = np.zeros((1000000, 512))
x1 = np.zeros((1, 512))
t1 = time.time()
for i in range(10):
np.dot(x1, x2.T)
t2 = time.time()
print("np.dot: ", t2-t1)
t1 = time.time()
for i in range(10):
blas.dgemm(alpha=1.0, a=x1, b=x2, trans_b=True)
t2 = time.time()
print("dgemm: ", t2-t1)
t1 = time.time()
for i in range(10):
blas.sgemm(alpha=1.0, a=x1, b=x2, trans_b=True)
t2 = time.time()
print("sgemm: ", t2-t1)
The result I got is:
np.dot: 0.1820526123046875
dgemm: 34.11782765388489
sgemm: 25.33052659034729
The following is my scipy's config which shows that it is compiled with OpenBLAS:
>>> import scipy
>>> scipy.__config__.show()
openblas_lapack_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
lapack_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
blas_mkl_info:
NOT AVAILABLE
openblas_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
The following is my numpy config which is largely the same as scipy's:
>>> import numpy
>>> numpy.__config__.show()
blas_mkl_info:
NOT AVAILABLE
blis_info:
NOT AVAILABLE
openblas_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
lapack_mkl_info:
NOT AVAILABLE
openblas_lapack_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
lapack_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
Am I using it wrong?