
I have a large matrix that I would like to convert to sparse CSR format.

When I do:

from scipy import sparse
Ks = sparse.csr_matrix(A)

print(Ks)

Where A is dense, I get

 (0, 0) -2116689024.0
 (0, 1) 394620032.0
 (0, 2) -588142656.0
 (0, 12)    1567432448.0
 (0, 14)    -36273164.0
 (0, 24)    233332608.0
 (0, 25)    23677192.0
 (0, 26)    -315783392.0
 (0, 45)    157961968.0
 (0, 46)    173632816.0

etc...

I can get vectors of row index, column index, and value using:

import numpy as np

Knz = Ks.nonzero()
sparserows = Knz[0]
sparsecols = Knz[1]

# The non-zero value of Ks at each (row, col)
vals = np.empty(sparserows.shape, dtype=float)
for i in range(len(sparserows)):
    vals[i] = Ks[sparserows[i], sparsecols[i]]

But is it possible to extract the vectors supposedly contained in the sparse CSR format (Value, Column Index, Row Pointer)?

SciPy's documentation explains that a CSR matrix could be generated from those three vectors, but I would like to do the opposite, get those three vectors out.

What am I missing?

Thanks for the time!

Jeff

1 Answer

value = Ks.data
column_index = Ks.indices
row_pointers = Ks.indptr

I believe these attributes are undocumented, which may make them subject to change, but I've used them on several versions of SciPy.
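As a quick check of what those three attributes hold, here is a minimal sketch on a small 3x3 matrix. The matrix `A` and its values are made up for illustration and are not the asker's data; the last lines rebuild the matrix from the three vectors using the `(data, indices, indptr)` constructor form mentioned in the question.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Illustrative dense matrix (not the asker's data)
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 0.0, 3.0],
              [4.0, 5.0, 6.0]])
Ks = csr_matrix(A)

values = Ks.data           # non-zero values, stored row by row
column_index = Ks.indices  # column of each stored value
row_pointers = Ks.indptr   # row i owns values[row_pointers[i]:row_pointers[i + 1]]

print(values)        # [1. 2. 3. 4. 5. 6.]
print(column_index)  # [0 2 2 0 1 2]
print(row_pointers)  # [0 2 3 6]

# Round trip: rebuild the CSR matrix from the three vectors
Ks2 = csr_matrix((values, column_index, row_pointers), shape=Ks.shape)
assert (Ks2.toarray() == A).all()
```

Note that `row_pointers` has one more entry than the matrix has rows, and its last entry equals the number of stored values.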

    But beware. `indptr` is a special condensed format array. It is not the same as the `row` of the `coo` format. `Ks.nonzero` first converts the `csr` array to `coo` format, and returns its `row` and `col` arrays. – hpaulj Mar 22 '18 at 21:17
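To make the difference between `indptr` and the `coo` row array concrete, the sketch below expands `indptr` into one row index per stored value and compares it with the `row` array from `tocoo()`. The matrix `A` and the names `counts` and `rows` are illustrative only.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Illustrative dense matrix (not the asker's data)
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 0.0, 3.0],
              [4.0, 5.0, 6.0]])
Ks = csr_matrix(A)

# indptr has n_rows + 1 entries; consecutive differences give
# the number of stored values in each row
counts = np.diff(Ks.indptr)

# Repeat each row number once per stored value in that row
rows = np.repeat(np.arange(Ks.shape[0]), counts)

# The expanded array matches the row array of the COO form
coo = Ks.tocoo()
assert np.array_equal(rows, coo.row)
assert np.array_equal(Ks.indices, coo.col)

print(Ks.indptr)  # [0 2 3 6]
print(rows)       # [0 0 1 2 2 2]
```

So `indptr` is the compressed form of the per-value row indices: it has length `n_rows + 1`, while the `coo` row array has one entry per stored value.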