
Problem: In numpy, I have a matrix M1 that I am multiplying with another matrix M2.

I know that I can spare half of the values in M1 because the resulting matrix will be symmetric and I only need the top k values.

So I am thinking of using numpy.tril to zero out half the values, hoping that the underlying C functions will be faster for multiplications a*b where a == 0, as they could stop right on seeing a == 0 instead of doing the whole floating-point operation.

I know I can time this, but I think this is a question of general interest.
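For reference, a minimal timing sketch (the matrix sizes here are made up, not from my application):

```python
import numpy as np
import timeit

# Hypothetical sizes; substitute your real matrices.
n = 500
rng = np.random.default_rng(0)
M1 = rng.random((n, n))
M2 = rng.random((n, n))
M1_tri = np.tril(M1)  # lower triangle kept, upper half zeroed

t_full = timeit.timeit(lambda: M1 @ M2, number=10)
t_tri = timeit.timeit(lambda: M1_tri @ M2, number=10)
print(f"full: {t_full:.4f}s  tril: {t_tri:.4f}s")
```

As noted in the comments below, the two timings come out essentially the same: the dense BLAS routine behind `@` performs every multiply-add regardless of zero entries.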

Note that M1 is not sparse; it is just that half of it need not be considered.

Perhaps there is even a better way to save 50% of the computation?

Background: This has to do with

p(A|B)*p(B) == p(B|A)*p(A)

... if you see what I mean.

Example: This is only one point where it happens, but in the very end, we have

  • a |A| x |A| matrix p(A|B) (A and B are the same variables)
  • a 1 x |A| matrix p(B)
  • the result is a |A| x |A| matrix p(A,B) = p(A|B)*p(B), where we do not care about the diagonal (it is the probability of a value given itself) or about the part above or below the diagonal (it duplicates the other half). Nice for a sanity check, but unnecessary after all.

Note that here it is actually not a dot product. But I guess half the computations leading to p(A|B) are also unnecessary.
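To make this concrete, here is a small sketch (random numbers standing in for actual probabilities) that computes only the strict upper triangle of p(A,B) via np.triu_indices, skipping the diagonal and the duplicate half:

```python
import numpy as np

# Hypothetical example: p_A_given_B is |A| x |A|, p_B is 1 x |A|.
n = 4
rng = np.random.default_rng(1)
p_A_given_B = rng.random((n, n))
p_B = rng.random((1, n))

# Full broadcast product computes all n*n entries:
p_AB_full = p_A_given_B * p_B

# Strict upper triangle only (k=1 excludes the diagonal):
r, c = np.triu_indices(n, k=1)
p_AB_upper = p_A_given_B[r, c] * p_B[0, c]

# The triangular entries match the corresponding full-product entries.
assert np.allclose(p_AB_upper, p_AB_full[r, c])
```

This halves the elementwise work, though for an operation this cheap the fancy indexing overhead may well outweigh the savings.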

Update: I will pursue a more reasonable approach for this application, which is to limit A and B to be disjoint. Then all matrices are reduced in size. This can be done elegantly in numpy but adds some complexity to reading the code.

That did not make sense after all. The only option would be to create M1.shape[0]-1 submatrices that recreate the triangle, but that would certainly produce too much overhead.

Radio Controlled
  • Could you reproduce minimal representative M1 and M2? – Divakar Dec 14 '16 at 09:47
  • 1
    IIUC you could do : `r,c = np.triu_indices(M1.shape[0])` and then : `np.einsum('ij,ij->i',M1[r], M2[c])`, but don't think it would give you any performance benefit because matrix-multiplication is just too fast and then the `tril` op isnt too heavy either. – Divakar Dec 14 '16 at 10:07
  • I am currently checking an alternative approach but I think the above suggestion is generally interesting. – Radio Controlled Dec 14 '16 at 10:09
  • This might be of some interest then - [`Processing upper triangular elements only with NumPy einsum`](http://stackoverflow.com/q/36599726/3293881). – Divakar Dec 14 '16 at 10:13
  • 1
    I tend to agree with @Divakar's conclusion. Btw. if you time your suggestion of M1*M2 vs M1_Triangular*M2, you get no significant difference. I don't think that multiplying with zero gives you any performance improvements. It is still a normal floating point computation. – jotasi Dec 14 '16 at 10:15
  • The sparse matrix formats may help: https://docs.scipy.org/doc/scipy-0.18.1/reference/sparse.html but I'm not sure whether the triangular matrices are "sparse enough". – fgoettel Dec 14 '16 at 10:29
  • @fgoettel I guess not... – Radio Controlled Dec 14 '16 at 10:48
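For completeness, a sketch of the triu_indices/einsum idea from the comments above, applied to a genuine matrix product. Note that for `result[i, j] = M1[i, :] . M2[:, j]` the pairing is rows of M1 with columns of M2, hence the `M2.T` below (my adjustment to the snippet in the comment):

```python
import numpy as np

n = 5
rng = np.random.default_rng(2)
M1 = rng.random((n, n))
M2 = rng.random((n, n))

# Upper-triangular index pairs, diagonal included.
r, c = np.triu_indices(n)

# Dot products M1[i, :] . M2[:, j] computed only for (i, j) in the
# upper triangle; M2.T[c] selects the columns of M2 as rows.
upper_vals = np.einsum('ij,ij->i', M1[r], M2.T[c])

# Matches the corresponding entries of the full product.
full = M1 @ M2
assert np.allclose(upper_vals, full[r, c])
```

As the comments point out, the einsum route still does the same per-entry work without BLAS, so it is unlikely to beat the plain full product in practice.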

0 Answers