What is the fastest way to compute a sparse Gram matrix in Python?

Question

A Gram matrix is a matrix of the form X @ X.T, which is symmetric by construction. For dense matrices, the numpy.dot implementation recognizes the self-multiplication and exploits the symmetry to speed up the computation (see this). However, no such effect can be observed with scipy.sparse matrices:

import scipy.sparse
from numpy import prod, random

random.seed(0)
X = random.randn(5,50)
X[X < 1.5] = 0
X = scipy.sparse.csr_matrix(X)
print(f'sparsity of X: {100 * (1 - X.count_nonzero() / prod(X.shape)):5.2f} %')
# sparsity of X: 92.00 %

%timeit X @ X.T
# 248 µs ± 10.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

X2 = X.copy()
%timeit X @ X2.T
# 251 µs ± 9.38 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

So I was wondering: What is the fastest way to compute a sparse Gram matrix in Python? Notably, it is sufficient to only compute the lower (or equivalently, the upper) triangle.
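To make the "only one triangle is needed" point concrete, here is a minimal sketch using plain scipy. It does not save any computation; it only illustrates that the upper triangle carries all the information in the Gram matrix. The variable names and the random generator are illustrative choices, with the shape and threshold taken from the example above.

```python
import numpy as np
import scipy.sparse

# Small random sparse matrix; shape and threshold as in the question's example.
rng = np.random.default_rng(0)
dense = rng.standard_normal((5, 50))
dense[dense < 1.5] = 0
X = scipy.sparse.csr_matrix(dense)

# Full Gram matrix (5 x 5), computed the ordinary way.
G = (X @ X.T).tocsr()

# Symmetry: G equals its own transpose (up to floating-point rounding).
is_symmetric = np.allclose(G.toarray(), G.T.toarray())

# Keep only the diagonal and the entries above it.
G_upper = scipy.sparse.triu(G, k=0, format="csr")

# Mirroring the strictly upper part restores the lower triangle.
G_rebuilt = G_upper + scipy.sparse.triu(G_upper, k=1, format="csr").T

print(is_symmetric)
print(np.allclose(G_rebuilt.toarray(), G.toarray()))
```

A routine that produces only `G_upper` directly would therefore be sufficient, since the full matrix can be rebuilt from it whenever it is needed.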

I've read several times that the skyline format is very efficient for symmetric matrices, but scipy doesn't support it. People often point to pysparse instead, but pysparse seems to have been discontinued a long time ago and does not support Python 3. At least, my Anaconda installation refuses to install pysparse because of compatibility issues with Python 3.

Sparse matrix transposition in scipy produces a new matrix, so there is no way to exploit symmetry like in numpy, where transposition is only flags applied to a view — talonmies, May 18 '20 at 11:32
@talonmies Are you sure? `X_coo = X.tocoo(); X_coo.data is X_coo.T.data` is `True` — theV0ID, May 18 '20 at 11:53
Positive. The transpose of CSR matrix is a CSC matrix -- https://github.com/scipy/scipy/blob/bf4e01b5862a8f20dd79f799ac2330f40cb93897/scipy/sparse/csr.py#L135 . And the resulting CSC matrix has its own indices which are not detectable as the same as the source CSR matrix. — talonmies, May 18 '20 at 12:02
The fastest way is `mkl_sparse_syrk` from Intel's mkl library, although it's a C function and calling it from python isn't trivial — CJR, May 18 '20 at 13:13
Can you try storing X2 = (X.T).copy() and then compute X@X2? It seems to be a bit faster. — Mercury, May 18 '20 at 14:05
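
The transposition behaviour discussed in the comments above can be inspected directly. The following is a small sketch on a toy matrix (the matrix itself is made up for illustration). Whether the underlying buffers are shared may depend on the SciPy version, so that part is printed rather than asserted.

```python
import numpy as np
import scipy.sparse

# Toy 2 x 3 matrix, chosen only for illustration.
X = scipy.sparse.csr_matrix(np.array([[1.0, 0.0, 2.0],
                                      [0.0, 3.0, 0.0]]))
Xt = X.T

# Transposing a CSR matrix yields a CSC matrix of the transposed shape.
print(X.format, "->", Xt.format)
print(X.shape, "->", Xt.shape)

# Whether the value buffer is shared with the original can be checked here;
# the answer may differ between SciPy versions.
print(np.shares_memory(X.data, Xt.data))

# Either way, the transpose is a separate Python object from X.
print(Xt is X)
```

Since `X.T` is a distinct object in a different storage format, `X @ X.T` reaches the multiplication routine as a product of two matrices, which is consistent with the timings in the question showing no benefit from the self-multiplication.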

theV0ID · Accepted Answer · 2020-05-19T12:39:21.277

Thanks to CJR's comment, I worked out a satisfactory solution. I found a library on GitHub that wraps the MKL routine mkl_sparse_spmm for Python. This routine performs fast multiplication of two sparse matrices. So all I had to do was extend the library with a similar wrapper for mkl_sparse_syrk, and this is exactly what I did.

I still have to add some comments; afterwards I will submit a pull request to the original project.

Here are the performance results, which are quite impressive:

import scipy.sparse
import sparse_dot_mkl
from numpy import allclose, prod, random, tril_indices

random.seed(0)
X = random.randn(500, 5000)
X[X < 0.8] = 0
X = scipy.sparse.csr_matrix(X)
print(f'X sparsity: {100 * (1 - X.count_nonzero() / prod(X.shape)):5.2f} %')
# X sparsity: 78.80 %

expected_result = (X @ X.T).toarray()
expected_result_triu = expected_result.copy()
expected_result_triu[tril_indices(expected_result.shape[0], k=-1)] = 0

mkl_result1 = sparse_dot_mkl.dot_product_mkl(X, X.T)
allclose(mkl_result1.toarray(), expected_result)
# True

mkl_result2 = sparse_dot_mkl.dot_product_transpose_mkl(X)
allclose(mkl_result2.toarray(), expected_result_triu)
# True

%timeit X @ X.T
# 197 ms ± 5.21 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit sparse_dot_mkl.dot_product_mkl(X, X.T)
# 70.6 ms ± 593 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit sparse_dot_mkl.dot_product_transpose_mkl(X)
# 34.2 ms ± 421 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Using the generic dot product from MKL instead of the dot product implementation from scipy is about 2.79 times as fast (197 ms vs. 70.6 ms). Using the specialized product for Gram matrix computation is about 5.76 times as fast (197 ms vs. 34.2 ms). This is huge.
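
The figures above are ratios of the measured mean run times. As a quick check of the arithmetic, using only the timings reported in this answer (the variable names are illustrative):

```python
# Mean run times reported above, in milliseconds.
t_scipy = 197.0     # X @ X.T
t_mkl_dot = 70.6    # sparse_dot_mkl.dot_product_mkl(X, X.T)
t_mkl_gram = 34.2   # sparse_dot_mkl.dot_product_transpose_mkl(X)

# Speed-up factor = baseline time / improved time.
speedup_dot = t_scipy / t_mkl_dot
speedup_gram = t_scipy / t_mkl_gram

print(f"generic MKL product:   {speedup_dot:.2f}x")
print(f"Gram-specific routine: {speedup_gram:.2f}x")
print(f"Gram-specific vs. generic MKL: {t_mkl_dot / t_mkl_gram:.2f}x")
```

The last ratio shows that the symmetry-aware routine takes roughly half the time of the generic MKL product, which is what one would expect if only one triangle is computed.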
