11

I currently want to multiply a large sparse matrix (~1M x 200k) by its transpose. The values of the resulting matrix would be floats.

  • I tried loading the matrix into scipy's sparse matrix and multiplying each row of the first matrix with the second matrix. The multiplication took ~2 hours to complete.

What is an efficient way to achieve this multiplication? I see a pattern in the computation:

  • The matrix is large and sparse.
  • The matrix is multiplied by its own transpose, so the resulting matrix is symmetric.

I would like to know which libraries can perform the computation faster. It can be in Python, R, C, C++ or any other language.

sravan_kumar
    Did you try just `matrix.dot(matrix.transpose())`? – user2357112 Jul 04 '14 at 04:21
  • @user2357112 by using Scipy? Yes.. It gave Segmentation Fault due to Memory Space. As the resulting matrix is huge with float values in it... – sravan_kumar Jul 04 '14 at 04:22
  • http://stackoverflow.com/questions/7477733/matrix-multiplication-for-sparse-matrices-in-python Duplicate? – Jblasco Jul 04 '14 at 07:15
  • could you post code? what format did you put the data in? scipy.sparse.csr_matrix? – Back2Basics Jul 04 '14 at 07:16
  • @Jblasco: Not a duplicate. Multiplying a matrix by its transpose is a special but common case. E.g. multiplying a vector by its transpose gives its length squared. – MSalters Jul 04 '14 at 08:14
  • @MSalters, do you think it is so different that the multiplication suggested in that solution will not be optimal? The fact that the result is particularly interesting doesn't mean the procedure to compute it is different. Is this one of those cases? – Jblasco Jul 04 '14 at 08:54

3 Answers

5

I suppose your main need is to save memory. First, since you multiply a matrix by its transpose, you do not need any memory for the transpose: all of its cells are directly accessible through the first matrix (tA[i,j] = A[j,i]). Almost 1/3 of the memory saved.

Computation time cannot be neglected either. Since the resulting matrix is symmetric, you can compute only one half and directly store the other. Nearly half of the computation time saved.

And if you are sure that your initial matrix is really sparse, and so can hope the resulting one will be too, you can store the result directly in a scipy sparse matrix in COO format: just three lists to store the non-zero values.

But ... I do not know of any library that does that, so you will have to code it yourself in your preferred language (probably Python, as you mentioned scipy).

Python code example (A is an M x N matrix):

I = []
J = []
V = []
for i in range(M):
    for j in range(i, M):
        X = 0.0
        for k in range(N):
            X += A[i][k] * A[j][k]   # (A * A.T)[i][j]
        if X != 0.0:  # or abs(X) > epsilon if floating point accuracy is a concern
            I.append(i)
            J.append(j)
            V.append(X)
            if i != j:  # mirror into the lower triangle, without duplicating the diagonal
                I.append(j)
                J.append(i)
                V.append(X)

And I, J, V are what is needed to build a scipy COO sparse matrix via:

RESULT = sparse.coo_matrix((V,(I,J)),shape=(M, M))
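As a quick sanity check of the idea (hypothetical small sizes, using scipy's built-in product for comparison), the result of A * A.T really is symmetric, so computing only the upper-triangle triplets and mirroring them is enough:

```python
import numpy as np
import scipy.sparse as sp

# Hypothetical small example, just to check the symmetry claim.
A = sp.random(50, 30, density=0.05, format='csr', random_state=0)
G = A.dot(A.T).tocoo()  # COO triplets: G.row, G.col, G.data

# The product is square (M x M) and symmetric.
assert G.shape == (50, 50)
assert np.allclose(G.toarray(), G.toarray().T)
```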
Serge Ballesta
1
from scipy.sparse import lil_matrix

def URMdist(URM):
    NLin = URM.shape[0]
    URMt = URM.T
    Result = lil_matrix((NLin, NLin))
    for Lin in range(NLin):
        for Col in range(Lin, NLin):
            # dot of a sparse row with a sparse column: a 1x1 sparse matrix
            X = URM[Col, :].dot(URMt[:, Lin])
            if X.nnz:
                # extract the scalar before storing it in the lil_matrix
                Result[Lin, Col] = Result[Col, Lin] = X.toarray()[0][0]
    return Result
  • can you double check your code? I believe you need to use NCol somewhere in this function, otherwise it works only if NLin == NCol, which is true only for square matrices – mhnatiuk Jun 04 '18 at 16:45
  • OK, so this works only if you do: `Result[Lin,Col] = Result[Col,Lin] = X.toarray()[0][0]` – mhnatiuk Jun 04 '18 at 16:52
  • It's been a while since I posted this, but I'm pretty sure I tested it. The matrix is symmetric, so since I am calculating huge matrices, I only compute half of it. And yes, this solution is meant only for square matrices. – Rodrigo Coelho Jun 07 '18 at 12:54
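The point raised in the comments can be seen directly: slicing a row of a CSR matrix and dotting it with a column slice yields a 1x1 sparse matrix, not a scalar, which is why the `.toarray()[0][0]` extraction is needed before storing the value (hypothetical small matrix below):

```python
import scipy.sparse as sp

# Hypothetical small matrix, just to show why the extraction is needed.
URM = sp.random(10, 5, density=0.5, format='csr', random_state=0)
cell = URM[0, :].dot(URM.T[:, 0])  # a 1x1 *sparse matrix*, not a scalar
X = cell.toarray()[0][0]           # plain float, safe to store in a lil_matrix
```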
0

The problem in your case is the sheer size of your original matrix. If A has on the order of 1e6 rows, then A*A.T has on the order of 1e12 entries! There is absolutely no way you can work with such a matrix on your desktop computer and expect it to be fast. Also, I believe the problem is not Python itself, because the critical loops in Scipy are implemented in C or Fortran (see here).

That being said, I was facing a smaller version of your problem myself (~200k rows, ~3k cols) and I found that Python works quite nicely if you are careful and use sparse matrices everywhere. (I was breaking the sparsity because I was dividing the sparse matrix by a vector. Don't do this.)

You do the multiplication with the normal overloaded `*` operator for sparse matrices.

import scipy.sparse as sp
import numpy as np

# ... in my case my sparse matrix is obtained from real data
# ... but we get something similar below

A = sp.rand(int(2e5), int(3e3), density=1e-3, format='csr')

# If used in the jupyter notebook
%time A * A.T      # Wall time: 2.95 s

# Another option
At = A.T.tocsr()
%time A * At       # Wall time: 2.89 s

I found it quite interesting that the second option is not significantly faster, even though the documentation says that it is faster to multiply two csr matrices.

Note that the result will also depend on the density of your sparse matrix.
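The density effect is easy to measure directly (hypothetical smaller sizes below so it runs quickly): the number of stored entries in the product grows rapidly with the density of the input, since denser rows are far more likely to share a column.

```python
import scipy.sparse as sp

# Hypothetical smaller sizes so this runs in well under a second.
nnz = {}
for density in (1e-3, 1e-2):
    A = sp.random(2000, 300, density=density, format='csr', random_state=0)
    nnz[density] = A.dot(A.T).nnz  # stored entries in the product

# The denser input produces a much denser product.
print(nnz)
```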

rodrigolece