I want to check whether two csr_matrix objects are equal.

If I do:

x.__eq__(y)

I get:

raise ValueError("The truth value of an array with more than one "
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().

This, however, works well:

assert (z in x for z in y)

Is there a better way to do it, maybe using an optimized SciPy function instead?

Thanks so much

AvidLearner
  • Have you tried `(x==y).all()`? Just a guess... – Christian K. Jun 06 '15 at 16:51
  • `==` and `all` are not practical calculations for sparse matrices. Most elements are 0. `x==x` would produce a matrix of all `True`, which is no longer sparse. – hpaulj Jun 07 '15 at 02:14

2 Answers

Can we assume they are the same shape?

In [201]: from scipy import sparse
In [202]: a=sparse.csr_matrix([[0,1],[1,0]])
In [203]: b=sparse.csr_matrix([[0,1],[1,1]])
In [204]: (a!=b).nnz==0
Out[204]: False

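This checks the sparsity of the inequality array: a != b is itself a sparse matrix whose stored entries mark the positions that differ, so nnz==0 means no element differs.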

It will give you an efficiency warning if you try a==b (at least the 1st time you use it). That's because it has to test all those zeros. It can't take much advantage of the sparsity.

You need a relatively recent version to use logical operators like this. Were you trying to use x.__eq__(y) in some if expression, or did you get the error from just that expression?

In general you probably want to check several parameters first. Same shape, same nnz, same dtype. You need to be careful with floats.
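
The checks above could be combined along these lines. This is only a sketch: the function name sparse_equal is made up for illustration, and it compares the count of actual nonzeros rather than the nnz attribute, on the assumption that explicitly stored zeros should not make two matrices unequal.

```python
from scipy import sparse


def sparse_equal(a, b):
    """Exact equality test for two sparse matrices (hypothetical helper)."""
    # Cheap checks first: a mismatch in shape or dtype settles it immediately.
    if a.shape != b.shape or a.dtype != b.dtype:
        return False
    # count_nonzero() ignores explicitly stored zeros, unlike the nnz attribute.
    if a.count_nonzero() != b.count_nonzero():
        return False
    # a != b is sparse; its stored entries mark the positions that differ.
    return (a != b).nnz == 0


a = sparse.csr_matrix([[0, 1], [1, 0]])
b = sparse.csr_matrix([[0, 1], [1, 1]])
same = sparse_equal(a, a.copy())
different = sparse_equal(a, b)
```

For floats, an exact test like this is usually too strict; a tolerance-based comparison is the safer choice there.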

For dense arrays np.allclose is a good way of testing equality. And if the sparse arrays aren't too large, that might be good as well:

np.allclose(a.A, b.A)

allclose uses all(less_equal(abs(x-y), atol + rtol * abs(y))). You can use a-b, but I suspect that this too will give an efficiency warning.
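
A rough sparse counterpart of that test, based on a-b, might look like the following. It is a sketch, not an equivalent of np.allclose: the name sparse_allclose is hypothetical, and it applies an absolute tolerance only (no rtol term) so that the difference never has to be densified.

```python
from scipy import sparse


def sparse_allclose(a, b, atol=1e-8):
    """Approximate equality using an absolute tolerance only (hypothetical helper)."""
    if a.shape != b.shape:
        return False
    # The difference stays sparse; only its stored entries can be nonzero.
    diff = abs(a - b).tocsr()
    if diff.nnz == 0:
        return True
    # Largest absolute difference among the stored entries.
    return bool(diff.data.max() <= atol)


a = sparse.csr_matrix([[0.0, 1.0], [1.0, 0.0]])
b = a + sparse.csr_matrix([[0.0, 1e-10], [0.0, 0.0]])
c = sparse.csr_matrix([[0.0, 1.0], [1.0, 1.0]])
close = sparse_allclose(a, b)
not_close = sparse_allclose(a, c)
```

A NaN in either matrix makes the comparison return False, which matches the default behaviour of np.allclose.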

hpaulj
  • The best tips I could get, Thanks! – AvidLearner Jun 07 '15 at 06:14
  • I wonder why `numpy.array_equal` doesn't appear to work as expected for two sparse inputs? – ely Mar 18 '17 at 22:25
  • A scipy sparse matrix is not an `np.ndarray`. It's an entirely different object class that stores its data in arrays. Try `np.asarray(M)` for a small sample matrix. Look at the result. Also look at the code for `np.array_equal` (it's Python). In general `numpy` functions don't work on sparse matrices. – hpaulj Mar 18 '17 at 22:56

SciPy and NumPy Hybrid Method

What worked best for my case was (using a generic code example):

bool_answer = np.array_equal(sparse_matrix_1.todense(), sparse_matrix_2.todense())

You might need to pay attention to the equal_nan parameter of np.array_equal. Note that todense() builds full dense copies, so this only suits matrices that fit in memory.
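
To illustrate the effect of equal_nan, here is a small sketch. The matrices m1 and m2 are made-up sample data, toarray() is used instead of todense() to get plain ndarrays, and the equal_nan argument assumes NumPy 1.19 or later.

```python
import numpy as np
from scipy import sparse

# Two identical sparse matrices that each store a NaN.
m1 = sparse.csr_matrix(np.array([[0.0, np.nan], [1.0, 0.0]]))
m2 = m1.copy()

# By default NaN != NaN, so the matrices compare as unequal.
strict = np.array_equal(m1.toarray(), m2.toarray())

# With equal_nan=True, NaNs in the same positions count as equal.
nan_ok = np.array_equal(m1.toarray(), m2.toarray(), equal_nan=True)
```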

The following doc references helped me get there: CSR Sparse Matrix Methods, CSC Sparse Matrix Methods, NumPy array_equal method, SciPy todense method.

Thom Ives