I was using the following chunk of code with networkx when I discovered an oddity. In the first case, I used multiplication (*) on a sparse matrix, and it unexpectedly gave me the correct degree sequence. However, when the same is done with an ordinary array, it gives me a 10 x 10 matrix, and, as expected, np.dot(...) gives me the correct result.

import numpy as np
import networkx as nx

ba = nx.barabasi_albert_graph(n=10, m=2)

A = nx.adjacency_matrix(ba)
# <10x10 sparse matrix of type '<class 'numpy.int64'>'
# with 32 stored elements in Compressed Sparse Row format>

A * np.ones(10)

# output: array([ 5.,  3.,  4.,  5.,  4.,  3.,  2.,  2.,  2.,  2.])

nx.degree(ba)

# output {0: 5, 1: 3, 2: 4, 3: 5, 4: 4, 5: 3, 6: 2, 7: 2, 8: 2, 9: 2}

B = np.ones(100).reshape(10, 10)

B * np.ones(10)

array([[ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
   [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
   [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
   [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
   [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
   [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
   [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
   [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
   [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
   [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.]])

np.dot(B, np.ones(10))
# array([ 10.,  10.,  10.,  10.,  10.,  10.,  10.,  10.,  10.,  10.])

I was expecting that I should be doing np.dot(A, np.ones(10)), but that returns an array of ten 10 x 10 sparse matrices:

array([ <10x10 sparse matrix of type '<class 'numpy.float64'>'
with 32 stored elements in Compressed Sparse Row format>,
   <10x10 sparse matrix of type '<class 'numpy.float64'>'
with 32 stored elements in Compressed Sparse Row format>,
   <10x10 sparse matrix of type '<class 'numpy.float64'>'
with 32 stored elements in Compressed Sparse Row format>,
   <10x10 sparse matrix of type '<class 'numpy.float64'>'
with 32 stored elements in Compressed Sparse Row format>,
   <10x10 sparse matrix of type '<class 'numpy.float64'>'
with 32 stored elements in Compressed Sparse Row format>,
   <10x10 sparse matrix of type '<class 'numpy.float64'>'
with 32 stored elements in Compressed Sparse Row format>,
   <10x10 sparse matrix of type '<class 'numpy.float64'>'
with 32 stored elements in Compressed Sparse Row format>,
   <10x10 sparse matrix of type '<class 'numpy.float64'>'
with 32 stored elements in Compressed Sparse Row format>,
   <10x10 sparse matrix of type '<class 'numpy.float64'>'
with 32 stored elements in Compressed Sparse Row format>,
   <10x10 sparse matrix of type '<class 'numpy.float64'>'
with 32 stored elements in Compressed Sparse Row format>], dtype=object)

What is the nuance here?

buzaku

1 Answer

For regular numpy arrays, * is element-by-element multiplication (with broadcasting), and np.dot is the matrix product, the sum of products. For the np.matrix subclass, * is the matrix product, the same as dot. A scipy.sparse matrix is not a subclass of np.matrix, but it is modeled on it, so * is the matrix product there too.

In [694]: A = sparse.random(10,10,.2, format='csr')
In [695]: A
Out[695]: 
<10x10 sparse matrix of type '<class 'numpy.float64'>'
    with 20 stored elements in Compressed Sparse Row format>
In [696]: A *np.ones(10)
Out[696]: 
array([ 0.6349177 ,  0.        ,  1.25781168,  1.12021258,  2.43477065,
        1.10407149,  1.95096264,  0.6253589 ,  0.44242708,  0.50353061])

The sparse matrix has the dot method, which behaves the same:

In [698]: A.dot(np.ones(10))
Out[698]: 
array([ 0.6349177 ,  0.        ,  1.25781168,  1.12021258,  2.43477065,
        1.10407149,  1.95096264,  0.6253589 ,  0.44242708,  0.50353061])

The dense version:

In [699]: np.dot(A.A,np.ones(10))
Out[699]: 
array([ 0.6349177 ,  0.        ,  1.25781168,  1.12021258,  2.43477065,
        1.10407149,  1.95096264,  0.6253589 ,  0.44242708,  0.50353061])
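To put the three behaviours side by side, here is a minimal sketch on a small 3 x 3 example. The names `dense`, `v`, and `S` are made up for illustration, and it assumes the older `csr_matrix` class used in this answer.

```python
import numpy as np
from scipy import sparse

v = np.ones(3)
dense = np.arange(9.0).reshape(3, 3)   # rows: [0,1,2], [3,4,5], [6,7,8]

# ndarray: * broadcasts v across the rows, element by element
elementwise = dense * v                # shape (3, 3)

# ndarray: np.dot (or @) is the matrix-vector product, i.e. the row sums here
matprod = np.dot(dense, v)             # shape (3,)

# np.matrix: * is the matrix product (v must be a column to line up)
M = np.matrix(dense)
matrix_prod = M * np.matrix(v).T       # shape (3, 1)

# sparse matrix: * is also the matrix product
S = sparse.csr_matrix(dense)
sparse_prod = S * v                    # shape (3,)
```

One caveat, as far as I know: the newer scipy sparse array classes (such as `csr_array`) treat * as element-wise, like ndarray, so on those `@` or `.dot` is the way to get the matrix product.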

I thought np.dot was supposed to handle sparse matrices correctly, that is, defer to their own method. But np.dot(A, np.ones(10)) does not do that; it produces an object array of sparse matrices. I can dig into why, but for now, avoid it.

In general, use sparse functions and methods with sparse matrices. Don't assume numpy functions will handle them correctly.
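As a rough illustration of that advice, the sketch below sticks to the sparse matrix's own methods. The small adjacency matrix is a made-up example (a 3-node star graph), not the one from the question.

```python
import numpy as np
from scipy import sparse

# Hypothetical adjacency matrix: node 0 connected to nodes 1 and 2
A = sparse.csr_matrix(np.array([[0, 1, 1],
                                [1, 0, 0],
                                [1, 0, 0]]))
ones = np.ones(A.shape[1])

# Matrix-vector product via the sparse method and via the @ operator
degrees_dot = A.dot(ones)
degrees_at = A @ ones

# Element-wise product has its own explicit method on sparse matrices
squared = A.multiply(A)      # stays sparse
```

Both `A.dot(ones)` and `A @ ones` give the degree sequence here, and `multiply` makes it explicit when an element-wise product is what you want.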


np.dot works fine when both arguments are sparse:

In [702]: np.dot(A,A)
Out[702]: 
<10x10 sparse matrix of type '<class 'numpy.float64'>'
    with 32 stored elements in Compressed Sparse Row format>
In [703]: np.dot(A,A.T)
Out[703]: 
<10x10 sparse matrix of type '<class 'numpy.float64'>'
    with 31 stored elements in Compressed Sparse Row format>

In [705]: np.dot(A, sparse.csr_matrix(np.ones(10)).T)
Out[705]: 
<10x1 sparse matrix of type '<class 'numpy.float64'>'
    with 9 stored elements in Compressed Sparse Row format>
In [706]: _.A
Out[706]: 
array([[ 0.6349177 ],
       [ 0.        ],
       [ 1.25781168],
       [ 1.12021258],
       [ 2.43477065],
       [ 1.10407149],
       [ 1.95096264],
       [ 0.6253589 ],
       [ 0.44242708],
       [ 0.50353061]])

For what it's worth, the sparse sum is performed with this kind of matrix product:

In [708]: A.sum(axis=1)
Out[708]: 
matrix([[ 0.6349177 ],
        [ 0.        ],
        [ 1.25781168],
        [ 1.12021258],
        [ 2.43477065],
        [ 1.10407149],
        [ 1.95096264],
        [ 0.6253589 ],
        [ 0.44242708],
        [ 0.50353061]])
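A quick way to convince yourself of the equivalence is to compare the two results directly. This is only a sketch with a made-up random matrix; it checks that the results agree, not how scipy implements the sum internally.

```python
import numpy as np
from scipy import sparse

# Build a small sparse matrix from a thresholded random dense array
rng = np.random.default_rng(0)
dense = rng.random((10, 10))
dense[dense > 0.2] = 0.0               # keep roughly 20% of the entries
A = sparse.csr_matrix(dense)

# Row sums via the sparse sum method (flattened to 1-D)
row_sums = np.asarray(A.sum(axis=1)).ravel()

# Row sums via a matrix product with a vector of ones
via_product = A.dot(np.ones(A.shape[1]))

# Column sums work the same way with the transpose
col_sums = np.asarray(A.sum(axis=0)).ravel()
via_left_product = A.T.dot(np.ones(A.shape[0]))
```

The `np.asarray(...).ravel()` step is there because the sum comes back as a 2-D np.matrix, as in the output above.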
hpaulj
  • In this case, it is a blessing, because the code looks much cleaner even though two very different things might be happening on either side of an expression. For instance, at one point in my code there is an expression like so: 'A * np.ones(10) > np.random.uniform(10) * np.ones(10)'. The LHS and RHS are different kinds of multiplication, but the mathematics is clear! – buzaku Jul 03 '17 at 03:37