
I'm working with a numpy matrix `adj` that represents the adjacency matrix of a networkx graph. When I construct `adj` as follows:

adj = sparse.csr_matrix(nx.adjacency_matrix(graph), dtype='longdouble').todense()

and later run `adj = adj ** 2`, I can see in htop that numpy uses all available threads.
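For context, here is a minimal, self-contained version of that setup; `gnp_random_graph` and its parameters are just stand-ins for my real graph:

import networkx as nx
from scipy import sparse

graph = nx.gnp_random_graph(1000, 0.01, seed=1)  # stand-in for the real graph
adj = sparse.csr_matrix(nx.adjacency_matrix(graph), dtype='longdouble').todense()
adj = adj ** 2  # matrix product (adj times adj), not an elementwise square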

However, to avoid the precision loss, I attempted to integrate mpmath somewhere in between.

I did it like this:

from mpmath import mp
import numpy as np

mp.dps = 120
adj = sparse.csr_matrix(nx.adjacency_matrix(graph), dtype='longdouble').todense()
# ... just like before
adjmp = mp.matrix(adj)                # this casts all values to mpf
adj = np.matrix(adjmp, dtype=object)  # and back to an np matrix, now with mpfs inside

The resulting `adj` looks like this:

matrix([[mpf('0.0'), mpf('0.0'), mpf('0.0'), ..., mpf('0.0'), mpf('0.0'),
         mpf('0.125')],
        ...

which is what I expect.
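A quick check confirms the conversion (the printed type name may vary slightly between mpmath versions):

print(adj.dtype)        # object
print(type(adj[0, 0]))  # an mpmath mpf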

The computation consists of two steps: the first squares `adj`, and the second is the actual computation. From the results I can see that the precision is much greater, but htop shows that the squaring step now runs on only one thread.
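The single-threaded behaviour is easy to reproduce in isolation; the size below is only illustrative:

import numpy as np
from mpmath import mp

mp.dps = 120
n = 300  # illustrative size
a_float = np.matrix(np.random.rand(n, n))
a_obj = np.matrix([[mp.mpf(v) for v in row] for row in a_float.tolist()], dtype=object)

_ = a_float ** 2  # compiled dot product: htop shows many busy cores
_ = a_obj ** 2    # Python-level loop over mpf objects: a single busy core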

When I run `np.show_config()`, I get:

blas_mkl_info:
  NOT AVAILABLE
blis_info:
  NOT AVAILABLE
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_mkl_info:
  NOT AVAILABLE
openblas_lapack_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
    This sparse matrix is not a numpy array subclass, so `mpmath` cannot handle it. Sparse has its own multiplication methods. – hpaulj Dec 22 '18 at 16:04
  • Sorry, I forgot to mention that I use adj.todense() before the conversion. I'll update the question. – A.Budziak Dec 22 '18 at 16:23
  • The object dtype array uses slow python iteration for math, passing the `**2` task to each `mpf` object individually (as in a list comprehension). – hpaulj Dec 22 '18 at 16:38
  • Is there any other solution for the problem of squaring matrices with high precision in parallel? (See the sketch after these comments.) – A.Budziak Dec 22 '18 at 17:41
  • Just to be clear, you are talking about element-by-element squares, not some sort of matrix power, right? – hpaulj Dec 22 '18 at 17:47
  • No, I'm talking about power in terms of matrix multiplication (squaring the whole matrix, not its elements). That's what happens for dense matrices when you use the `**` operator, I guess. – A.Budziak Dec 22 '18 at 18:22
  • There's a tendency to discourage the use of `np.matrix`, in part because of confusion on issues like this. In any case `M**2` delegates to `np.matrix.__pow__` method, which in turn uses `np.linalg.matrix_power`. For object dtype and power 2 it uses `np.dot(M, M)`. For numeric dtype `dot` uses fast `BLAS` routines. For object dtype it uses a slower iteration. – hpaulj Dec 22 '18 at 19:53
  • `adj = (...).todense()` is a `np.matrix`, so there's no need for a further `np.matrix(adj)`. Since the rest is about dense matrix power, the initial sparse step was a bit of a distraction. I don't know `networkx`, but is the intermediate sparse step necessary? – hpaulj Dec 22 '18 at 20:14
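Following up on the parallelism question above: since the object-dtype `np.dot` loops in Python and holds the GIL, one possible workaround is to split row blocks across processes rather than threads. This is only a sketch, assuming the `mpf` elements pickle cleanly across process boundaries; `parallel_square` and `_rows_times_m` are hypothetical helpers, not library functions:

from multiprocessing import Pool

import numpy as np

def _rows_times_m(args):
    # Multiply one horizontal block of rows by the full matrix.
    block, full = args
    return np.dot(block, full)

def parallel_square(m, nproc=4):
    # Square an object-dtype matrix by farming row blocks out to worker
    # processes; threads would not help because the per-element mpf
    # arithmetic runs under the GIL.
    a = np.asarray(m)
    blocks = np.array_split(a, nproc)
    with Pool(nproc) as pool:
        parts = pool.map(_rows_times_m, [(b, a) for b in blocks])
    return np.matrix(np.vstack(parts))

On platforms that spawn rather than fork, this needs to live in a module and be called under `if __name__ == "__main__":`.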

0 Answers