
I am trying to enable multithreading/multiprocessing in an Anaconda installation of Numpy. My test program is the following:

import os
import numpy as np
from timeit import timeit

size = 1024
A = np.random.random((size, size))
B = np.random.random((size, size))
print 'Time with %s threads: %f s' \
      %(os.environ.get('OMP_NUM_THREADS'),
        timeit(lambda: np.dot(A, B), number=4))

I change the environment variable OMP_NUM_THREADS, but regardless of its value the code always takes the same amount of time to run, and only a single core is ever used.

It appears that my Numpy is linked against OpenBLAS:

numpy.__config__.show()
lapack_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/home/myuser/anaconda3/envs/py2env/lib']
    define_macros = [('HAVE_CBLAS', None)]
    language = c
blas_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/home/myuser/anaconda3/envs/py2env/lib']
    define_macros = [('HAVE_CBLAS', None)]
    language = c
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/home/myuser/anaconda3/envs/py2env/lib']
    define_macros = [('HAVE_CBLAS', None)]
    language = c
blis_info:
  NOT AVAILABLE
openblas_lapack_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/home/myuser/anaconda3/envs/py2env/lib']
    define_macros = [('HAVE_CBLAS', None)]
    language = c
lapack_mkl_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE

and this is the relevant part of my conda list:

conda list | grep blas
blas                      1.1                    openblas    conda-forge
libblas                   3.9.0           1_h6e990d7_netlib    conda-forge
libcblas                  3.9.0           3_h893e4fe_netlib    conda-forge
numpy                     1.14.6          py27_blas_openblashd3ea46f_200  [blas_openblas]  conda-forge
openblas                  0.2.20                        8    conda-forge
scikit-learn              0.19.2          py27_blas_openblasha84fab4_201  [blas_openblas]  conda-forge

I also tried setting OPENBLAS_NUM_THREADS, but it did not make any difference. I use a Python 2.7 environment in conda 4.12.0.
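As I understand it, OpenBLAS reads these variables once when the library is loaded, so they have to be set before numpy is imported (or exported in the shell before starting Python). A minimal variant of my test with the variables set up front (the value 4 is just an example) would look like this:

import os
# OpenBLAS reads these once at load time, so they must be set
# before numpy (and through it OpenBLAS) is imported.
os.environ['OMP_NUM_THREADS'] = '4'
os.environ['OPENBLAS_NUM_THREADS'] = '4'

import numpy as np
from timeit import timeit

size = 1024
A = np.random.random((size, size))
B = np.random.random((size, size))
print 'Time with %s threads: %f s' \
      % (os.environ.get('OMP_NUM_THREADS'),
         timeit(lambda: np.dot(A, B), number=4))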

Botond
  • Numpy random number generation probably doesn’t use multiple cores, but other operations do. Try with matrix multiplication. – jkr Nov 28 '22 at 18:27
  • @jkr, this is matrix multiplication of two random matrices. np.dot() is called, that's what I expect to be parallel. – Botond Nov 28 '22 at 20:03

1 Answer

If you want to parallelize the processing yourself, you will have to break the problem down and use Python's multithreading or multiprocessing tools to implement it, as sketched below. Here is a scipy doc on how to get started.
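As a rough sketch of that idea (the worker count, the row-block chunking and the way B is shipped to every process are just illustrative choices, not a tuned solution):

import numpy as np
from multiprocessing import Pool

def dot_chunk(args):
    # Multiply one horizontal block of A with the full B.
    a_block, b = args
    return np.dot(a_block, b)

if __name__ == '__main__':
    size = 1024
    A = np.random.random((size, size))
    B = np.random.random((size, size))

    n_workers = 4
    blocks = np.array_split(A, n_workers, axis=0)

    pool = Pool(n_workers)
    try:
        # B is pickled and copied to every worker, so for large matrices the
        # copy overhead can outweigh the parallel speed-up.
        C = np.vstack(pool.map(dot_chunk, [(block, B) for block in blocks]))
    finally:
        pool.close()
        pool.join()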

If you need to do more sophisticated calculations, you can also consider using mpi4py. If most of your calculations are plain numpy calculations, I would also consider using dask; it helps chunk and parallelize your code, and many of the numpy functions are already supported.
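For example, a minimal dask sketch of the same matrix product (the chunk size is arbitrary and this assumes dask is installed):

import dask.array as da

size = 1024
# Each 256x256 block can be processed in parallel by dask's threaded scheduler.
A = da.random.random((size, size), chunks=(256, 256))
B = da.random.random((size, size), chunks=(256, 256))

C = da.dot(A, B)      # lazy: builds a task graph, nothing is computed yet
result = C.compute()  # evaluates the blocked matrix product in parallel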

spo
    I don’t think this is true. Some numpy operations do use multiple cores. – jkr Nov 28 '22 at 18:26
  • there's `np.dot()` that should be parallel by `numpy` – Botond Nov 28 '22 at 20:06
  • Maybe I am wrong about it not being built in for some functions - edited my answer to reflect this. For building anything more complicated, I think the rest of the answer still stands. – spo Nov 28 '22 at 20:28