I want to compute the transitive closure of a sparse matrix in Python. Currently I am using scipy sparse matrices.
The matrix power (**12 in my case) works well on very sparse matrices, no matter how large they are, but for directed not-so-sparse cases I would like to use a smarter algorithm.
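For reference, here is a minimal sketch of the matrix-power idea written as repeated squaring, so the exponent does not have to be guessed in advance. The helper name closure_by_squaring is hypothetical, and the sketch assumes the relation may be treated as reflexive (it adds the diagonal, which changes nothing for relations that are already reflexive). It stays sparse throughout, but each squaring can still be expensive on not-so-sparse inputs.

```python
import numpy as np
import scipy.sparse as sp


def closure_by_squaring(adj):
    """Reflexive-transitive closure of a square sparse boolean matrix.

    Hypothetical helper: squares the reachability matrix until the set of
    stored entries stops growing, so about log2(longest path) products.
    """
    n = adj.shape[0]
    # Use small integers and reset every stored value to 1 after each
    # product, so the entries (path counts) never grow large.
    reach = sp.csr_matrix(adj, dtype=np.int32)
    reach.eliminate_zeros()  # drop explicitly stored False entries
    reach = (reach + sp.identity(n, dtype=np.int32, format="csr")).tocsr()
    reach.data[:] = 1
    while True:
        nxt = (reach @ reach).tocsr()
        nxt.data[:] = 1
        # The diagonal is set, so the pattern can only grow; an unchanged
        # number of stored entries means the closure has been reached.
        if nxt.nnz == reach.nnz:
            return nxt.astype(bool)
        reach = nxt


reflexive = sp.csr_matrix(np.array([
    [1, 1, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
], dtype=bool))
closure = closure_by_squaring(reflexive)
print(closure.toarray())
```

The result is still a scipy sparse matrix; toarray() is only used here to print the small example.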
I have found the Floyd-Warshall algorithm (the German page has better pseudocode) in scipy.sparse.csgraph, which does a bit more than it should: there is no function for Warshall's algorithm alone. That is one thing.
The main problem is that I can pass a sparse matrix to the function, but this is pointless, as the function always returns a dense matrix: what should be 0 in the transitive closure is now a path of inf length, and someone felt this needs to be stored explicitly.
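To make the problem concrete, here is a small sketch of what happens with scipy.sparse.csgraph.floyd_warshall on a three-node chain (the example graph is made up for illustration). The result comes back as a dense array of distances, and the closure can only be recovered from it afterwards, so the dense n-by-n intermediate is unavoidable on this route.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import floyd_warshall

# Illustrative directed chain 0 -> 1 -> 2, stored sparsely.
adj = sp.csr_matrix(np.array([
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
], dtype=float))

dist = floyd_warshall(adj, directed=True)
print(type(dist))  # a dense NumPy array, even though the input was sparse
print(dist)        # unreachable pairs are stored explicitly as inf

# Finite distance means "reachable". The diagonal is 0, hence finite,
# so this is the reflexive-transitive closure.
closure = sp.csr_matrix(np.isfinite(dist))
print(closure.toarray())
```

Converting back to sparse at the end does not help with memory, because the full dense distance matrix has already been allocated.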
So my question is: Is there any Python module that can compute the transitive closure of a sparse matrix and keep the result sparse?
I am not 100% sure that he works with the same matrices, but Gerald Penn shows impressive speed-ups in his comparison paper, which suggests that it is possible to solve the problem.
EDIT: Since there was some confusion, here is the theoretical background:
I am looking for the transitive closure only (not the reflexive or symmetric closure).
I will make sure myself that the relation encoded in the boolean matrix already has the other required properties, i.e. symmetry or reflexivity.
I have two cases of the relation:
- reflexive
- reflexive and symmetric
I want to compute the transitive closure of those two relations. This works perfectly well with the matrix power (except that in certain cases it is too expensive):
>>> reflexive
matrix([[ True,  True, False,  True],
        [False,  True,  True, False],
        [False, False,  True, False],
        [False, False, False,  True]])
>>> reflexive**4
matrix([[ True,  True,  True,  True],
        [False,  True,  True, False],
        [False, False,  True, False],
        [False, False, False,  True]])
>>> reflexive_symmetric
matrix([[ True,  True, False,  True],
        [ True,  True,  True, False],
        [False,  True,  True, False],
        [ True, False, False,  True]])
>>> reflexive_symmetric**4
matrix([[ True,  True,  True,  True],
        [ True,  True,  True,  True],
        [ True,  True,  True,  True],
        [ True,  True,  True,  True]])
So in the first case, we get all the descendants of a node (including the node itself), and in the second, we get the connected components: each row marks all the nodes that are in the same component as that node.
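The component reading of the second case can be sketched with scipy.sparse.csgraph.connected_components, which returns one label per node instead of a matrix. This is only an illustration under the assumption that the relation is reflexive and symmetric; it does not cover the directed case. The names member and closure are introduced here for the example.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import connected_components

reflexive_symmetric = sp.csr_matrix(np.array([
    [1, 1, 0, 1],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
], dtype=bool))

# labels[i] is the component index of node i.
n_components, labels = connected_components(reflexive_symmetric, directed=False)

# Sparse membership matrix: member[i, c] is set iff node i lies in component c.
n = labels.size
member = sp.csr_matrix(
    (np.ones(n, dtype=np.int32), (np.arange(n), labels)),
    shape=(n, n_components),
)

# Two nodes are related in the closure iff they share a component.
closure = (member @ member.T).astype(bool)
print(n_components)
print(closure.toarray())
```

The label vector is the compact form of the result; the explicit closure is only as sparse as the components are small, since each component becomes a full block.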