How to iterate non-zeroes in a sparse matrix in Chapel

Question

I have a matrix A still hanging around. It's large, sparse and now symmetric. I've created a sparse domain called spDom that contains the non-zero entries. Now, I want to iterate along row r and find the non-zero entries there, along with their indices. My goal is to build another domain that is essentially row r's non-zeroes.

**RFC**: Do you indeed **have** a matrix `A`? As the large, sparse matrix problem started weeks ago, citing a python-domain representation of large sparse array ( Yes, the promised **`repr( I );repr( V )`** remains yet due to get posted there ). Here, in Chapel-domain, several rather isolated aspects of handling large-sparse arrays were also raised. I try to address in this RFC the overall context - as without any such, the isolated steps do not reflect all the costs, associated with initial data-extraction, issues with distributed-processing, with translating intermediate representation et al — user3666197, Aug 23 '17 at 19:46
As with many HPC problems, the isolated view on just one, context-less transformation-step will never allow your end-to-end processing-flow to reach any remarkable level of smart process design. ( Just recall the JSON in the middle of the process-flow & reflect the associated memory-mapping costs and nice-talkative-syntax-rich-format-re-wrapper for the small sparse-sub-set of a massive-array elements, which would have to be either xlated to dense-format ( just due to JSON-representation constraints, **exploding into ~ 2-3 x O(^2) in `[SPACE]` needs**) or ... — user3666197, Aug 23 '17 at 19:55
.. or some new, JSON-independent strategy will have to be introduced, so as to keep the [SPACE]-footprint feasible ( btw, this was the initial motivation right for introducing the sparse-representations, wasn't it ) and as the data-flows from start towards the end of the processing, each cost-of-transformation is cardinal not only in [PSPACE] ( as was objected above ) but also in [PTIME], if not [EXPTIME] domain, as it will be accumulated down the path. For this reason it is more than advisable to **not trying close one's eyes, to move "agile"-blind, but to rather work inside a full context** — user3666197, Aug 23 '17 at 20:03
Do not confuse my matrix `A` with my matrix `A`. Please, stay on topic. — Brian Dolan, Aug 23 '17 at 20:21

score 1 · Accepted Answer · answered Sep 15 '17 at 23:05

Here's an answer that will work with Chapel 1.15 as long as you're willing to store your sparse domain/array in CSR format:

First, I'll establish my (small, non-symmetric) sparse matrix for demonstration purposes:

use LayoutCS;                               // use the CSR/CSC layout module

config const n = 10;                        // declare problem size
const D = {1..n, 1..n};                     // declare dense domain                                  
var SD: sparse subdomain(D) dmapped CS();   // declare sparse subdomain                 

// populate sparse domain with some indices                                                       
SD += (1,1);
SD += (1,n/2);
SD += (2, n/4);
SD += (2, 3*n/4);
SD += (n/2, 1);
SD += (n/2, n);

var A: [SD] real;                          // declare sparse array                                  

forall (i,j) in SD do                      // initialize sparse array values                       
  A[i,j] = i + j/10.0;

My solution relies on an undocumented iterator on sparse CS* domains named dimIter(), which can be used to iterate over the dimension that's stored consecutively in memory (so along a row for CSR and along a column for CSC). dimIter() takes two arguments: the dimension to iterate over (1=rows, 2=columns) and the index in the other dimension. Thus, to visit each row r and iterate over the columns at which it stores elements, I could do:

for r in 1..n {
  writeln("row ", r, " contains elements at:");
  for c in SD.dimIter(2, r) do
    writeln("  column ", c, ": ", A[r,c]);
}

For the sparse matrix I show above, this yields:

row 1 contains elements at:
  column 1: 1.1
  column 5: 1.5
row 2 contains elements at:
  column 2: 2.2
  column 7: 2.7
row 3 contains elements at:
row 4 contains elements at:
row 5 contains elements at:
  column 1: 5.1
  column 10: 6.0
row 6 contains elements at:
row 7 contains elements at:
row 8 contains elements at:
row 9 contains elements at:
row 10 contains elements at:

We're interested in generalizing the dimIter() iterator and making it part of the standard sparse domain/array interface, but haven't done so yet due to (a) questions about how to generalize it to n-dimensional sparse arrays and (b) questions about whether we need to support inefficient iteration directions (e.g., should one be able to iterate over columns of CSR or rows of CSC given the expense?)
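
To connect this back to the original goal of building a domain that holds just row r's non-zeroes, here is a minimal sketch that collects the column indices yielded by dimIter() into a separate 1D sparse domain and counts them along the way. It assumes the Chapel 1.15 behavior described above (1-based dimension numbers and the undocumented dimIter() iterator, which may change); the names r, Cols, rowDom and count are hypothetical and chosen only for illustration.

```chapel
use LayoutCS;                               // CSR/CSC layout module, as above

config const n = 10;                        // problem size
config const r = 2;                         // row to extract (hypothetical name)

const D = {1..n, 1..n};                     // dense parent domain
var SD: sparse subdomain(D) dmapped CS();   // CSR sparse domain

// a few sample indices, a subset of the ones used above
SD += (1,1);
SD += (1,n/2);
SD += (2,n/4);
SD += (2,3*n/4);

const Cols = {1..n};                        // dense 1D domain of column indices
var rowDom: sparse subdomain(Cols);         // will hold row r's non-zero columns

var count = 0;
for c in SD.dimIter(2, r) {                 // assumption: 1.15-style dimIter()
  rowDom += c;                              // record the column index
  count += 1;                               // tally the non-zeroes in this row
}

writeln("row ", r, " has ", count, " non-zeroes");
for c in rowDom do
  writeln("  column ", c);
```

This still walks the row once, so it is no cheaper than the loop above; it only packages the result as its own domain, which can then be used to declare or slice other arrays.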

Is there a super clever way to turn this into a count of non-zeroes in the row? — Brian Dolan, Sep 18 '17 at 21:50
I did this but it seems like it could be cooler: `var t = 0; for c in A.domain.dimIter(2,v) do t = t+1;` — Brian Dolan, Sep 18 '17 at 22:02
Here's a cute but inefficient way (still iterates over the row): `for r in 1..n { var row = SD.dimIter(2,r); writeln("row ", r, " contains ", row.size, " elements"); }` I'm not sure you want to see the efficient but ugly way -- might be a good feature request GitHub issue — Brad, Sep 18 '17 at 22:02
Like Toyota, you asked for it, you got it: https://github.com/chapel-lang/chapel/issues/7374 — Brian Dolan, Sep 18 '17 at 22:16
