This is going to be a long story, but maybe some of you would like to study this case.

I am working on parallel graph algorithm development. I've chosen a cutting-edge, HPC parallel graph data structure named STINGER. STINGER's mission statement reads:

"STINGER should provide a common abstract data structure such that the large graph community can quickly leverage each others' research developments. [...] Algorithms written for STINGER can easily be translated/ported between multiple languages and frameworks [...] It is recognized that no single data structure is optimal for every graph algorithm. The objective of STINGER is to configure a sensible data structure that can run most algorithms well. There should be no significant performance reduction for using STINGER when compared with another general data structure across a broad set of typical graph algorithms."

STINGER is probably rather efficient and well suited to shared-memory parallelism. On the other hand, it is not very abstract, general-purpose, or concise. The interface that STINGER provides is unsatisfactory for me for several reasons: it is too verbose (functions require parameters that are not important in my case), it models only a directed graph whereas I need an undirected one, and so on.

However, I shy away from implementing a new parallel graph data structure on my own.

So I've already started to encapsulate a STINGER instance with my own Graph class. For example, to check whether an undirected edge exists, I can now call Graph::hasEdge(node u, node v) instead of writing into my algorithms:

int to = stinger_has_typed_successor(stinger, etype, u, v);
int back = stinger_has_typed_successor(stinger, etype, v, u);
bool answer = to && back;
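For illustration, here is a minimal sketch of what such a wrapper might look like. It does not include the real STINGER headers: the `stinger` struct and `stinger_has_typed_successor` below are stand-ins backed by a `std::set` so that the snippet compiles on its own, and `insertEdge`, the `node` typedef and the private member names are assumptions made for this example.

```cpp
#include <cstdint>
#include <set>
#include <tuple>

typedef int64_t node;

// Stand-in for the STINGER structure (NOT the real library type):
// a set of directed, typed edges stored as (etype, from, to).
struct stinger {
    std::set<std::tuple<int64_t, node, node> > edges;
};

// Stand-in that mirrors the call used above: returns nonzero if the
// directed edge (from, to) of the given type is present.
int stinger_has_typed_successor(const stinger* S, int64_t etype, node from, node to) {
    return S->edges.count(std::make_tuple(etype, from, to)) ? 1 : 0;
}

class Graph {
public:
    explicit Graph(int64_t etype = 0) : etype_(etype) {}

    // Hypothetical helper: an undirected edge is stored as two directed edges.
    void insertEdge(node u, node v) {
        s_.edges.insert(std::make_tuple(etype_, u, v));
        s_.edges.insert(std::make_tuple(etype_, v, u));
    }

    // The undirected edge {u, v} exists only if both directions are present.
    bool hasEdge(node u, node v) const {
        int to = stinger_has_typed_successor(&s_, etype_, u, v);
        int back = stinger_has_typed_successor(&s_, etype_, v, u);
        return to && back;
    }

    // Gives access to the wrapped instance, e.g. for the traversal macros.
    stinger* asSTINGER() { return &s_; }

private:
    stinger s_;
    int64_t etype_;
};
```

The edge type is fixed once in the constructor here, so the algorithms no longer have to pass `etype` on every call.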

So far, this has worked well. Now to the topic of iteration.

STINGER realizes traversal (iteration over nodes, edges, incident edges of a node etc.) via macros. For example, you write

STINGER_PARALLEL_FORALL_EDGES_BEGIN(G.asSTINGER(), etype) {
    node u = STINGER_EDGE_SOURCE;
    node v = STINGER_EDGE_DEST;
    std::printf("found edge (%d, %d)", u, v);
} STINGER_PARALLEL_FORALL_EDGES_END();

Here STINGER_PARALLEL_FORALL_EDGES_BEGIN expands to

do {                                    \
                            \
                            \
    for(uint64_t p__ = 0; p__ < (G.asSTINGER())->ETA[(etype)].high; p__++) {    \
      struct stinger_eb *  current_eb__ = ebpool + (G.asSTINGER())->ETA[(etype)].blocks[p__]; \
      int64_t source__ = current_eb__->vertexID;            \
      int64_t type__ = current_eb__->etype;             \
      for(uint64_t i__ = 0; i__ < stinger_eb_high(current_eb__); i__++) { \
        if(!stinger_eb_is_blank(current_eb__, i__)) {                   \
          struct stinger_edge * current_edge__ = current_eb__->edges + i__;

The macro hides the internals of the data structure, which apparently need to be completely exposed for efficient (parallel) iteration. There are macros for various combinations, including STINGER_FORALL_EDGES_BEGIN, STINGER_READ_ONLY_FORALL_EDGES_BEGIN, STINGER_READ_ONLY_PARALLEL_FORALL_EDGES_BEGIN...

Yes, I could use these macros, but I wonder whether there is a more elegant way to implement iteration. If I could wish for an interface, it would look similar to

G.forallEdges(readonly=true, parallel=true, {..})

GraphIterTools.forallEdges(G, readonly=true, parallel=true, {...})

where {...} is simply a function, a closure or a "block of code", which would then be executed appropriately. However, I lack the C++ experience to implement this. I wonder what advice you can give me on this issue. Maybe also "You should go with the macros because...".

clstaudt
  • What is the reason that you use a macro? You could define a class `G` and have `forallEdges` as a static method. However, you will not be able to put the arguments as in your examples, it will become `G.forallEdges(true, true, ...)` – Zane Dec 05 '12 at 16:14
  • The macro is what already exists. I would have to build an alternative to the macros myself. What would a static method `forallEdges` look like that could be used in places where the macro is used? – clstaudt Dec 05 '12 at 16:19
  • I think beerboy has given a good answer to this. – Zane Dec 06 '12 at 10:15

1 Answer

Leveraging the existing macros, you could implement a member function on your graph class like this:

template<typename Callback>
void forallEdges(int etype, Callback callback)
{
    STINGER_PARALLEL_FORALL_EDGES_BEGIN(this->asSTINGER(), etype) {
        node u = STINGER_EDGE_SOURCE;
        node v = STINGER_EDGE_DEST;

        // call the supplied callback
        callback(u, v);

    } STINGER_PARALLEL_FORALL_EDGES_END();
}

Then define a callback function and pass it to your new method:

void my_callback(node u, node v) { ... }
...
G.forallEdges(etype, my_callback);

Or in C++11 you can use a lambda function:

G.forallEdges(etype, [](node u, node v) { ... });
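C++ has no named arguments such as `parallel=true`, but a template flag can get close to the interface you wished for. The following is only a sketch under assumptions: the `Graph` here is a toy stand-in that stores a plain edge list instead of a STINGER instance, the helper names `forallEdgesImpl` and `countEdges` are made up for this example, and both stand-in loops run sequentially. In a real wrapper, each private overload would contain the matching STINGER macro pair instead.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <utility>
#include <vector>

typedef int64_t node;

// Toy stand-in for the wrapper class: a plain edge list, no STINGER inside.
class Graph {
public:
    void insertEdge(node u, node v) { edges_.push_back(std::make_pair(u, v)); }

    // The traversal variant is chosen at compile time:
    //   G.forallEdges<true>(callback);   // parallel variant
    //   G.forallEdges<false>(callback);  // sequential variant
    template<bool Parallel, typename Callback>
    void forallEdges(Callback callback) const {
        forallEdgesImpl(callback, std::integral_constant<bool, Parallel>());
    }

private:
    // Sequential variant. A real wrapper would use
    // STINGER_FORALL_EDGES_BEGIN / STINGER_FORALL_EDGES_END here.
    template<typename Callback>
    void forallEdgesImpl(Callback& callback, std::false_type) const {
        for (std::size_t i = 0; i < edges_.size(); ++i)
            callback(edges_[i].first, edges_[i].second);
    }

    // Parallel variant. A real wrapper would use
    // STINGER_PARALLEL_FORALL_EDGES_BEGIN / STINGER_PARALLEL_FORALL_EDGES_END
    // here; this stand-in simply runs the same loop sequentially.
    template<typename Callback>
    void forallEdgesImpl(Callback& callback, std::true_type) const {
        for (std::size_t i = 0; i < edges_.size(); ++i)
            callback(edges_[i].first, edges_[i].second);
    }

    std::vector<std::pair<node, node> > edges_;
};

// Example use: count edges with a callback that stays correct even if it
// is invoked from several threads, because the counter is atomic.
inline int countEdges(const Graph& G) {
    std::atomic<int> count(0);
    G.forallEdges<true>([&count](node, node) { ++count; });
    return count.load();
}
```

A read-only flag could be added as a second `bool` template parameter in the same way. Keep in mind that with the parallel macros the callback may run concurrently, so anything it captures by reference has to be safe for that (hence the `std::atomic` counter above).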
beerboy