
I have a weighted graph with 30k nodes and 160k edges, and no negative weights. I would like to compute the shortest paths from every node to every other node. I don't think I can assume any particular heuristic to simplify the problem.

I tried to use this Dijkstra C implementation http://compprog.wordpress.com/2007/12/01/one-source-shortest-path-dijkstras-algorithm/, which solves the single-source shortest path problem, calling the function dijkstras() for each of my 30k nodes. As you can imagine, it takes ages. At the moment I don't have the time to write and debug the code by myself; I have to compute these paths as soon as possible and store them in a database, so I am looking for a faster, ready-to-use solution. Do you have any tips?

I have to run it on a recent MacBook Pro with 8 GB of RAM, and I would like to find a solution that takes no more than 24 hours to finish the computation.

Thanks a lot in advance!!

Eugenio

  • So what do you advise? I tried this code http://compprog.wordpress.com/2007/12/01/one-source-shortest-path-dijkstras-algorithm/ and it takes more than one minute to compute all the shortest paths from one node. And this needs to be done 30k times, so it's not feasible. – Eugenio Aug 23 '11 at 18:21
  • Just to be sure, are you compiling this code with optimization flags on and not in debug mode? – Louis Ricci Aug 23 '11 at 19:26
  • @LastCoder It's not debug mode, but I didn't use any optimization flags; I wasn't sure about the drawbacks. Which optimizations do you advise? – Eugenio Aug 24 '11 at 01:06
  • Note that you don't have to perform Dijkstra's from all nodes to all other nodes - once you know the shortest path from A to B, you also know the shortest path from B back to A. You also know the shortest path from every node on that path to every other node on that path. – caf Aug 24 '11 at 01:48

5 Answers


I looked over the Dijkstra's algorithm link that you posted in the comments and I believe that it's the source of your inefficiency. Inside the inner Dijkstra's loop, it's using an extremely unoptimized approach to determine which node to explore next (a linear scan over every node at each step). The problematic code is in two spots. The first is this code, which tries to find the next node to operate on:

mini = -1;
for (i = 1; i <= n; ++i)
    if (!visited[i] && ((mini == -1) || (d[i] < d[mini])))
         mini = i;

Because this code is nested inside of a loop that visits every node, the complexity (as mentioned in the link) is O(|V|²), where |V| is the number of nodes. In your case, since |V| is 30,000, there will be nine hundred million iterations of this loop overall. This is painfully slow (as you've seen), but there's no reason to have to do this much work.

Another trouble spot is here, which tries to find which edge in the graph should be used to reduce the cost of other nodes:

for (i = 1; i <= n; ++i)
   if (dist[mini][i])
       if (d[mini] + dist[mini][i] < d[i])
           d[i] = d[mini] + dist[mini][i];

This scans over an entire row in the adjacency matrix looking for nodes to consider, which takes O(|V|) time regardless of how many outgoing edges actually leave the node.

While you could try fixing up this version of Dijkstra's into a more optimized implementation, I think the correct option here is just to throw this code away and find a better implementation of Dijkstra's algorithm. For example, if you use the pseudocode from the Wikipedia article implemented with a binary heap, you can get Dijkstra's algorithm running in O(|E| log |V|). In your case, that quantity is just over two million, about 450 times fewer iterations than your current approach. That's a huge difference, and I'm willing to bet that a better Dijkstra's implementation will finish in substantially less time than before.
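For concreteness, here is a minimal sketch of the binary-heap approach using std::priority_queue. This is not the linked implementation; the adjacency-list layout (struct Edge, vector of edge lists) is my own assumption for illustration.

#include <queue>
#include <vector>
#include <limits>
#include <functional>
#include <utility>

struct Edge { int to; long long w; };

std::vector<long long> dijkstra(const std::vector<std::vector<Edge> > &adj, int src)
{
    const long long INF = std::numeric_limits<long long>::max();
    std::vector<long long> dist(adj.size(), INF);

    /* Pair is (distance, node); std::greater turns the heap into a min-heap. */
    typedef std::pair<long long, int> State;
    std::priority_queue<State, std::vector<State>, std::greater<State> > pq;

    dist[src] = 0;
    pq.push(State(0, src));

    while (!pq.empty()) {
        State cur = pq.top();
        pq.pop();
        long long d = cur.first;
        int u = cur.second;

        if (d > dist[u])
            continue;                       /* stale entry, already improved */

        for (std::size_t k = 0; k < adj[u].size(); ++k) {
            const Edge &e = adj[u][k];
            if (d + e.w < dist[e.to]) {     /* relax the edge u -> e.to */
                dist[e.to] = d + e.w;
                pq.push(State(dist[e.to], e.to));
            }
        }
    }
    return dist;
}

The key difference from the code you linked is that the next node to settle is popped from the heap in O(log |V|) instead of being found with a linear scan, and only the actual outgoing edges of a node are examined rather than a whole adjacency-matrix row.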

On top of this, you might want to consider running all the Dijkstra searches in parallel, as Jacob Eggers has pointed out. This can get you an extra speed boost for each processor core that you have. Combined with the above (and more critical) fix, this should probably give you a huge performance increase.

If you plan on running this algorithm on a much denser data set (one where the number of edges approaches |V|² / log |V|), then you may want to consider switching to the Floyd-Warshall algorithm. Running Dijkstra's algorithm once per node (sometimes called Johnson's algorithm) takes O(|V| |E| log |V|) time, while Floyd-Warshall takes O(|V|³) time. However, for the data set you've mentioned, the graph is sufficiently sparse that running multiple Dijkstra's instances should be fine.

Hope this helps!

templatetypedef
  • That is one comprehensive answer! A+. – j_random_hacker Aug 24 '11 at 00:30
  • @templatetypedef thanks for the good analysis! I tried to switch to Floyd's algo using this code http://www.cs.rochester.edu/~nelson/courses/csc_173/graphs/apsp.html that is clearly O(|V|³) but it is still very very slow...around 3-4 minutes for a node (Does this timing make sense to you?), which means 3 mins * 30k to calculate all the paths. As I wrote, I don't have much time, that's why I'm trying to avoid rewriting and debugging an implementation from scratch. I'm really surprised I haven't found any good quality Dijkstra's implementation in C on the Web! – Eugenio Aug 24 '11 at 01:05
  • This is totally a shameless self-plug, but a while back I wrote an implementation of Dijkstra's algorithm with Fibonacci heaps (runs in O(E + V log V), which is quite fast). The code is online (in Java) on my personal site: http://keithschwarz.com/interesting/code/?dir=dijkstra – templatetypedef Aug 24 '11 at 01:10
  • @Eugenio- As I mentioned in my answer, I would strongly suggest not using Floyd-Warshall here, since it's much less efficient than just repeatedly calling Dijkstra's algorithm. If you're pressed for time, you are probably best off spending a few hours finding/writing a good Dijkstra's algorithm and just using that, since the hours you put in reducing execution time will make it possible to get results at all, versus using a slow version that won't get you anything. – templatetypedef Aug 24 '11 at 01:50
  • @templatetypedef Thanks again; I tried anyway to optimize cs.rochester.edu/~nelson/courses/csc_173/graphs/apsp.html. Actually the code doesn't need three 30k×30k matrices but just two (you can use just the matrices D and P, right?). After this change and a -O gcc flag setting, the code now takes less than 2 secs per node, which means I can run it all in half a day. Also, in my graph the weight of a->b is always the same as b->a; can I exploit that, or should it be taken into consideration? – Eugenio Aug 24 '11 at 02:25

How about the Floyd-Warshall algorithm?
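For reference, a minimal Floyd-Warshall sketch, assuming the graph has already been loaded into a dense |V|×|V| distance matrix d, with d[i][j] set to the edge weight (or "infinity" when there is no edge) and d[i][i] = 0. The layout and names are assumptions for illustration.

#include <vector>
#include <limits>

void floydWarshall(std::vector<std::vector<long long> > &d)
{
    const long long INF = std::numeric_limits<long long>::max();
    const std::size_t n = d.size();

    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t i = 0; i < n; ++i) {
            if (d[i][k] == INF) continue;        /* no path from i through k */
            for (std::size_t j = 0; j < n; ++j) {
                if (d[k][j] == INF) continue;
                if (d[i][k] + d[k][j] < d[i][j])
                    d[i][j] = d[i][k] + d[k][j]; /* shorter path i -> k -> j */
            }
        }
}

Note that with 30k nodes the matrix alone is 30k × 30k entries, roughly 7 GB at 8 bytes per entry, so memory is a concern on an 8 GB machine even before considering the O(|V|³) running time.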

Aaron Klotz
  • With a low ratio of edges to nodes, Floyd-Warshall tends to perform slower than repeated iterations of Dijkstra's algorithm. FW has complexity O(V³), while multiple Dijkstra calls take O(VE + V² log V). If E = O(V), this reduces to O(V² log V). Since OP's inputs are fairly large, I'd expect the multiple Dijkstra calls to run much more quickly. – templatetypedef Aug 23 '11 at 20:51

Does your graph have any special structure? Is the graph planar (or nearly so)?

I'd recommend not trying to store all shortest paths: even a fairly compact encoding (30k² "where to go next" entries) will take up about 7 GB of memory.

What is the application? Are you sure that doing a bidirectional Dijkstra (or A*, if you have a heuristic) won't be fast enough when you need to find a particular shortest path?
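To make the "where to go next" encoding concrete, here is a sketch of reconstructing one path from such a table. The names and the int16_t entry type are my own assumptions; at 2 bytes per entry, 30k² entries come to roughly 1.8 GB.

#include <vector>
#include <stdint.h>

/* next[s][v] holds the first hop on a shortest path from s to v,
   or -1 if v is unreachable from s. */
std::vector<int> reconstructPath(const std::vector<std::vector<int16_t> > &next,
                                 int s, int v)
{
    std::vector<int> path;
    if (next[s][v] < 0)
        return path;                 /* v is unreachable from s */

    int cur = s;
    path.push_back(cur);
    while (cur != v) {
        cur = next[cur][v];          /* follow the stored first hop toward v */
        path.push_back(cur);
    }
    return path;
}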

Rob Neuhaus
  • I don't think it's planar or that it has any other special structure. Why do you say 7 GB? It's 900M entries, right? The application is web based and needs the answer immediately; for that reason I wanted to store the paths in a DB. Do you think asking for a specific path is something that can be done in less than two seconds? @rrenaud – Eugenio Aug 24 '11 at 01:06
  • I was thinking it would cost 8 bytes per entry. But really if you have 30k, log(30k) ~= 15, so you could get away with 2 bytes per entry. – Rob Neuhaus Aug 24 '11 at 01:57

The bottleneck can be the data structure you use to store the paths. If you use too much storage, you run out of cache and memory space very quickly, causing a fast algorithm to run very slowly because it picks up a constant multiplier on the order of 100 (cache misses) or 10,000+ (swapped pages).

Because you have to store the paths in a database, I suspect that could easily be the bottleneck. It is probably best to first generate the paths in memory using a very compact encoding, such as N bits per vertex, where N is the maximum number of edges per vertex. Then set a bit for each edge that can be used as part of one of the shortest paths. After generating this path information, you can run a recursive algorithm to convert it into a format suitable for database storage.
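As a sketch of the bit-per-edge idea, assuming an adjacency list and a dist array already filled in by one Dijkstra run from the current source (all names here are illustrative): an edge u -> v with weight w can be used on a shortest path exactly when dist[u] + w == dist[v].

#include <vector>
#include <limits>

struct Edge { int to; long long w; };

/* onShortestPath[u][k] is set when the k-th outgoing edge of u can be used
   as part of a shortest path from the current source. */
void markShortestPathEdges(const std::vector<std::vector<Edge> > &adj,
                           const std::vector<long long> &dist,
                           std::vector<std::vector<bool> > &onShortestPath)
{
    const long long INF = std::numeric_limits<long long>::max();
    for (std::size_t u = 0; u < adj.size(); ++u) {
        if (dist[u] == INF)
            continue;                         /* unreachable from the source */
        for (std::size_t k = 0; k < adj[u].size(); ++k) {
            const Edge &e = adj[u][k];
            /* The edge is on a shortest path iff relaxing it is tight. */
            onShortestPath[u][k] = (dist[u] + e.w == dist[e.to]);
        }
    }
}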

Of course, the most likely bottleneck is still the database. You want to think very carefully about the format you use to store the information, because inserting, searching and modifying large datasets in an SQL database is very slow. Also, using transactions for the database operations may reduce the disk write overhead if the database engine manages to batch multiple insertions into a single disk write.
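For example, with SQLite (just as an illustration; the answer only says "SQL database", and the table and column names below are made up), wrapping all the inserts for one source in a single transaction turns 30k synchronous writes into one:

#include <sqlite3.h>

/* Stores the distances computed for one source node. Returns true on success. */
bool storeDistances(sqlite3 *db, const long long *dist, int numNodes, int src)
{
    if (sqlite3_exec(db, "BEGIN TRANSACTION", 0, 0, 0) != SQLITE_OK)
        return false;

    sqlite3_stmt *stmt = 0;
    sqlite3_prepare_v2(db, "INSERT INTO shortest (src, dst, dist) VALUES (?, ?, ?)",
                       -1, &stmt, 0);

    for (int v = 0; v < numNodes; ++v) {
        sqlite3_bind_int(stmt, 1, src);
        sqlite3_bind_int(stmt, 2, v);
        sqlite3_bind_int64(stmt, 3, dist[v]);
        sqlite3_step(stmt);
        sqlite3_reset(stmt);                  /* reuse the prepared statement */
    }
    sqlite3_finalize(stmt);

    /* One commit flushes all rows in a single batch instead of 30k syncs. */
    return sqlite3_exec(db, "COMMIT", 0, 0, 0) == SQLITE_OK;
}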

It may be even better to simply store results in an in-memory cache and discard solutions when they are no longer actively needed, then generate the same results again if you happen to need them. That means you would generate paths only on demand, when you actually need them. The runtime for 30k nodes and 160k edges should be well below a second for a single single-source run of Dijkstra.

For shortest path algorithms I have always chosen C++. There is no reason a C implementation couldn't be just as simple, but C++ offers reduced coding effort thanks to the STL containers, which can be used for an initial implementation; an optimized queue can be implemented later if benchmarking and profiling show that something better than what the STL offers is needed.

#include <queue>
#include "vertex.h"   /* assumed to define the vertex and edge classes used below */

class searchnode {
    vertex *dst_;
    unsigned long dist_;
public:
    searchnode(vertex *destination, unsigned long distance) :
        dst_(destination),
        dist_(distance)
    {
    }

    bool operator<(const searchnode &b) const {
        /* std::priority_queue stores the largest value at the top,
           so invert the comparison to pop the smallest distance first */
        return dist_ > b.dist_;
    }

    vertex *dst() const { return dst_; }

    unsigned long travelDistance() const { return dist_; }
};

/* Single-source Dijkstra: fills in the distance of every vertex reachable
   from src. Every vertex's distance is assumed to be initialised to
   "infinity" (e.g. ULONG_MAX) before the call; run this once per node to
   get all pairs. */
static void dijkstra(vertex *src)
{
    std::priority_queue<searchnode> queue;

    queue.push(searchnode(src, 0));

    while (!queue.empty()) {
        searchnode cur = queue.top();
        queue.pop();

        /* Skip stale queue entries: this vertex has already been settled
           with a distance that is at least as good. */
        if (cur.travelDistance() >= cur.dst()->distance())
            continue;

        cur.dst()->setDistance(cur.travelDistance());

        for (edge *eiter = cur.dst()->begin(); eiter != cur.dst()->end(); ++eiter) {
            unsigned long nextDist = cur.travelDistance() + eiter->cost();

            /* Only queue the neighbour if this path improves on what has
               already been settled for it. */
            if (nextDist >= eiter->otherVertex()->distance())
                continue;

            queue.push(searchnode(eiter->otherVertex(), nextDist));
        }
    }
}
Pauli Nieminen

If you can modify the algorithm to be multi-threaded, you might be able to finish it in less than 24 hours.

The first node may take more than 1 minute. However, the 15,000th node should only take half that time, because you will already have calculated the shortest paths to all of the previous nodes.
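A minimal sketch of the multi-threaded outer loop, assuming a thread-safe single-source routine (the singleSource callback below is a placeholder, not an existing function) that writes its results into per-source storage and shares nothing mutable between calls:

#include <thread>
#include <vector>
#include <functional>

/* Runs singleSource(src) once for every source vertex 0..numNodes-1,
   spreading the calls across numThreads worker threads. */
void runAllSourcesInParallel(int numNodes, int numThreads,
                             const std::function<void(int)> &singleSource)
{
    std::vector<std::thread> workers;
    for (int t = 0; t < numThreads; ++t) {
        workers.emplace_back([=, &singleSource]() {
            /* Each worker handles an interleaved slice of the sources. */
            for (int src = t; src < numNodes; src += numThreads)
                singleSource(src);
        });
    }
    for (std::size_t i = 0; i < workers.size(); ++i)
        workers[i].join();
}

Called, for example, as runAllSourcesInParallel(30000, 8, ...) with a lambda that runs one Dijkstra search and stores its row of results.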

Jacob Eggers