I need to find a path (requirements below) in a directed graph whose nodes can be categorized into layers. These layers are important for the structure of the graph (properties of the graph below).

Example of such a layered directed graph

Requirements of the path to be found:

  • The path must start in the first layer and end in the last layer.
  • There must be no nodes u,v on the path that are connected by a backedge e=(u,v) (i.e. there must be no cycle whose nodes are all part of the path).
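
The two requirements can be written down as a small check. This is only a minimal sketch: the function name `is_allowed_path` and its parameters are made up for illustration, the layer numbers are taken from the example graph specification in the answer, and the check assumes the path already follows forward edges (it does not verify connectivity).

```python
def is_allowed_path(path, layer, back_edges, first_layer, last_layer):
    """Return True if `path` meets both requirements listed above.

    `layer` maps node -> layer number, `back_edges` is a set of (u, v) pairs.
    Assumption: consecutive nodes of `path` are joined by forward edges.
    """
    if not path:
        return False
    # Requirement 1: start in the first layer and end in the last layer.
    if layer[path[0]] != first_layer or layer[path[-1]] != last_layer:
        return False
    # Requirement 2: no backedge may have both of its end nodes on the path.
    on_path = set(path)
    return not any(u in on_path and v in on_path for u, v in back_edges)


# Example graph: node -> layer, plus its single backedge (4, 1).
layer = {1: 1, 2: 2, 3: 2, 4: 3, 5: 3, 6: 4, 7: 5, 8: 5, 9: 6}
back_edges = {(4, 1)}

print(is_allowed_path([1, 2, 5, 6, 8, 9], layer, back_edges, 1, 6))  # True
print(is_allowed_path([1, 2, 4, 6, 8, 9], layer, back_edges, 1, 6))  # False
```

Turning the path into a set first keeps the check linear in the number of backedges.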

Example of an allowed path

Allowed path: p=(1,2,5,6,8,9), because there is no backedge between any of these nodes, i.e. there is no cycle in the graph that consists only of nodes from p.

Example of a disallowed path

Disallowed path: p=(1,2,4,6,8,9), because there is a backedge e=(4,1), i.e. there is a cycle c=(1,2,4,1) whose nodes are all part of p.

Restricting properties of the directed graph:

  • Each layer contains at least one node.
  • A directed edge e=(u,v) can only exist if layer(v) = layer(u) + 1 (i.e. it leads to the next layer).
  • A backedge e=(u,v) can only exist if layer(u) >= layer(v) + 2 (i.e. it leads at least two layers back).
  • There is always at least one "normal" path from the first to the last node.
  • A special path (as described above) might not exist, though.
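
The two edge rules can be used to sort the edges of an input graph into forward edges and backedges. A minimal sketch, assuming the graph is given as a list of (u, v) pairs plus a node -> layer dictionary (the name `classify_edges` is hypothetical; the data is the example graph from the answer):

```python
def classify_edges(edges, layer):
    """Split edges into forward edges and backedges using the layer rules."""
    forward, back = [], []
    for u, v in edges:
        if layer[v] == layer[u] + 1:
            forward.append((u, v))      # leads to the next layer
        elif layer[u] >= layer[v] + 2:
            back.append((u, v))         # leads at least two layers back
        else:
            raise ValueError(f"edge {(u, v)} violates the layer rules")
    return forward, back


layer = {1: 1, 2: 2, 3: 2, 4: 3, 5: 3, 6: 4, 7: 5, 8: 5, 9: 6}
edges = [(1, 2), (1, 3), (2, 4), (2, 5), (3, 5), (3, 4),
         (4, 1), (4, 6), (5, 6), (6, 7), (7, 9), (8, 9)]

forward, back = classify_edges(edges, layer)
print(back)  # [(4, 1)]
```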

I thought about a DFS-like approach, or topological sorting (ignoring the backedges first), but I couldn't find an algorithm with feasible runtime. The graph can get quite big (e.g. >100 nodes per layer with more than ten layers). Any ideas or hints on whether an algorithm with reasonable complexity exists?
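
To get a feeling for why checking every path one by one can get out of hand at this size, the number of forward paths can be counted with dynamic programming. This is only a sketch under a worst-case assumption that is not stated for the real data, namely that every node is linked to every node in the next layer; `count_forward_paths` and `dense_layered_graph` are hypothetical helper names.

```python
from collections import defaultdict


def count_forward_paths(forward_edges, layer, s, e):
    """Count s -> e paths that use forward edges only (dynamic programming)."""
    succ = defaultdict(list)
    for u, v in forward_edges:
        succ[u].append(v)
    counts = defaultdict(int)
    counts[s] = 1
    # Visit nodes layer by layer, so counts[u] is final before it is used.
    for u in sorted(layer, key=layer.get):
        for v in succ[u]:
            counts[v] += counts[u]
    return counts[e]


def dense_layered_graph(layers, width):
    """Hypothetical worst case: each node links to every node one layer on."""
    layer = {(l, i): l for l in range(1, layers + 1) for i in range(width)}
    edges = [((l, i), (l + 1, j))
             for l in range(1, layers)
             for i in range(width)
             for j in range(width)]
    return layer, edges


layer, edges = dense_layered_graph(layers=5, width=4)
print(count_forward_paths(edges, layer, (1, 0), (5, 0)))  # 64, i.e. 4 ** 3
```

In such a dense graph the count is width ** (layers - 2), so 100 nodes per layer and 10 layers would already give 100 ** 8 = 10 ** 16 candidate paths between one start node and one end node. Real data may be far sparser than this.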

Context: This graph theory problem is a (hopefully useful) abstraction of another, more complex problem I'm trying to solve.

Edit1:

Here is the implementation, as suggested below, using Python and networkx. Note that the graph is built in such a way that for every backedge e=(u,v), u>v holds. The user-data file is too big to post (>17000 lines).

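import networkx as nx

def is_result(path, forbidden_pairs):
    # A path is valid only if no forbidden pair has both of its nodes on it.
    nodes = set(path)
    for (u, v) in forbidden_pairs:
        if u in nodes and v in nodes:
            return False
    return True

def find_path(G, s, e):
    # The graph is built so that every backedge (u, v) has u > v.
    forbidden_pairs = {(u, v) for (u, v) in G.edges if u > v}

    # Note: this removes the backedges from G in place.
    G.remove_edges_from(forbidden_pairs)

    for path in nx.all_shortest_paths(G, s, e):
        if is_result(path, forbidden_pairs):
            return path
    return None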
srcnuzn
  • I'm pretty sure there's a P-time reduction from max independent set to this problem, which would make this one NP-hard. That might not apply to the "more complex" problem, so maybe this just isn't the abstraction you're looking for. – Matt Timmermans Apr 14 '22 at 16:41
  • This isn't an answer, but in the terminology of graph theory what you're looking for is an induced path in a digraph. A subgraph is induced from a graph by selecting a number of nodes and all edges (or arcs) between nodes in the selection. You want to find out if an induced path between two specific nodes exists. This might help with searching for an answer. – Dave Apr 15 '22 at 13:55

1 Answer

  1. Identify all potential backedges ( e=(u,v) if layer(u) >= layer(v) + 2 )
  2. For each potential backedge, add nodes u,v to a list of "forbidden pairs"
  3. Remove the potential backedges from the graph
  4. Identify the shortest path from source to destination; paths are visited in order of increasing length ( https://www.geeksforgeeks.org/find-paths-given-source-destination )
  5. Check the path for forbidden pairs. If the path contains none, then DONE
  6. Otherwise, identify the next shortest path ( loop back to step 5 )
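
The steps above can be sketched in plain Python. This is not the PathFinder implementation, just an illustration with made-up names (`find_allowed_path`, `paths_from`), assuming the graph is given as (u, v) pairs and a node -> layer dictionary. Since forward edges only lead to the next layer, all source-to-destination paths have the same length here, so a plain depth-first enumeration already visits them "in order of increasing length".

```python
def find_allowed_path(edges, layer, s, e):
    """Return the first s -> e path without a forbidden pair, or None."""
    # Steps 1-3: backedges become forbidden pairs and are left out of
    # the forward adjacency lists.
    forbidden = set()
    succ = {}
    for u, v in edges:
        if layer[u] >= layer[v] + 2:
            forbidden.add((u, v))
        else:
            succ.setdefault(u, []).append(v)

    # Step 4: enumerate all s -> e paths depth-first.
    def paths_from(node, path):
        path.append(node)
        if node == e:
            yield list(path)
        else:
            for nxt in succ.get(node, []):
                yield from paths_from(nxt, path)
        path.pop()

    # Steps 5-6: test each path, stop at the first feasible one.
    for path in paths_from(s, []):
        on_path = set(path)
        if not any(u in on_path and v in on_path for u, v in forbidden):
            return path
    return None


layer = {1: 1, 2: 2, 3: 2, 4: 3, 5: 3, 6: 4, 7: 5, 8: 5, 9: 6}
edges = [(1, 2), (1, 3), (2, 4), (2, 5), (3, 5), (3, 4),
         (4, 1), (4, 6), (5, 6), (6, 7), (7, 9), (8, 9)]
print(find_allowed_path(edges, layer, 1, 9))  # [1, 2, 5, 6, 7, 9]
```

The first path visited, 1 2 4 6 7 9, is rejected because of the backedge (4, 1); the second one passes. When no feasible path exists, every path is visited before `None` is returned, which is where the runtime goes on large inputs.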

Specification of the example graph as a space delimited text file

n 1 1
n 2 2
n 3 2
n 4 3
n 5 3
n 6 4
n 7 5
n 8 5
n 9 6
l 1 2
l 1 3
l 2 4
l 2 5
l 3 5
l 3 4
l 4 1
l 4 6
l 5 6
l 6 7
l 7 9
l 8 9
s 1
e 9
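
For anyone who wants to read this format outside of PathFinder, here is a minimal parser sketch. It rests on my reading of the file, which is an assumption: `n` lines give a node id followed by its layer, `l` lines give a link from the first node to the second, and `s` / `e` give the start and end node. Node ids are assumed to be integers, and `parse_graph` is a hypothetical name.

```python
def parse_graph(text):
    """Parse the space-delimited format into (layer, edges, start, end)."""
    layer, edges, start, end = {}, [], None, None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue                      # skip blank lines
        kind = fields[0]
        if kind == "n":                   # n <node> <layer>
            layer[int(fields[1])] = int(fields[2])
        elif kind == "l":                 # l <from> <to>
            edges.append((int(fields[1]), int(fields[2])))
        elif kind == "s":                 # s <start node>
            start = int(fields[1])
        elif kind == "e":                 # e <end node>
            end = int(fields[1])
    return layer, edges, start, end


sample = """n 1 1
n 2 2
n 4 3
l 1 2
l 2 4
l 4 1
s 1
e 4
"""
print(parse_graph(sample))
# ({1: 1, 2: 2, 4: 3}, [(1, 2), (2, 4), (4, 1)], 1, 4)
```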

Reading this into Pathfinder ( https://github.com/JamesBremner/PathFinder ) gives

[image: PathFinder output for the example graph]

Here is the implementation code

void cPathFinder::srcnuzn()
{
    // backup the full graph
    auto backup = myG;

    // identify and remove potential back edges
    std::vector<std::pair<int, int>> vforbidden;
    for (auto &l : links())
    {
        if (node(source(l)).myCost - node(target(l)).myCost >= 2)
        {
            // path must not contain this pair of nodes
            vforbidden.push_back(std::make_pair(source(l), target(l)));

            // remove the edge
            removeLink(source(l), target(l));
        }
    }

    try
    {
        // recursively visit all possible paths
        visitAllPaths(
            myStart, myEnd,
            [this, &vforbidden](int pathlength)
            {
                // this "visitor" is called whenever a possible path is found
                bool fOK = true;

                // check if path contains any node pairs forbidden by backedges;
                // a back edge spans at least two layers, so its end nodes are
                // never adjacent on the path and the whole path must be searched
                // (std::find needs <algorithm>)
                auto last = myPath.begin() + pathlength;
                for (auto &f : vforbidden)
                {
                    if (std::find(myPath.begin(), last, f.first) != last &&
                        std::find(myPath.begin(), last, f.second) != last)
                    {
                        fOK = false;
                        break;
                    }
                }
                if (fOK)
                {
                    // feasible path found, mark and throw exception to escape from recursion
                    myPath.resize(pathlength);
                    throw std::domain_error("srcnuzn_ok");
                }

                // path is not feasible, return to continue recursion to next path
                return;
            });

        // all possible paths visited with no feasible path found
        std::cout << "no feasible path\n";
        myPath.clear();
    }

    catch (std::domain_error &e)
    {
        // exception thrown, indicating a feasible path found
        std::cout << "srcnuzn_ok ";
        for (auto n : myPath)
            std::cout << userName(n) << " ";
        std::cout << "\n";
    }

    // restore full graph
    myG = backup;
}

Here is a graph with 8 layers, each with 3 nodes and 2 random back edges.

[image: random graph with 8 layers of 3 nodes each]

Running the algorithm 4 times on different random graphs with these parameters takes 2.3 milliseconds on average.

8 layers, 3 nodes per layer, 2 back edges per layer = 2ms
10 layers, 10 nodes per layer, 4 back edges per layer = 16ms
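
To try timings like these on other parameters, random test graphs are needed. How the graphs above were generated is not shown, so the following is only a hypothetical generator with its own assumptions: nodes are numbered consecutively, every node is linked to every node in the next layer, and each layer from the third onward gets `back_per_layer` backedges to random nodes at least two layers earlier (duplicates are possible).

```python
import random


def random_layered_graph(layers, width, back_per_layer, seed=None):
    """Build (layer, edges) for a random layered graph (assumptions above)."""
    rng = random.Random(seed)
    layer, by_layer, node = {}, [], 0
    for l in range(1, layers + 1):
        ids = []
        for _ in range(width):
            node += 1
            layer[node] = l
            ids.append(node)
        by_layer.append(ids)

    # Forward edges: every node to every node of the next layer.
    edges = []
    for a, b in zip(by_layer, by_layer[1:]):
        edges.extend((u, v) for u in a for v in b)

    # Backedges: by_layer[i] holds layer i + 1, targets lie in layers 1..i-1.
    for i in range(2, layers):
        earlier = [v for ids in by_layer[:i - 1] for v in ids]
        for _ in range(back_per_layer):
            edges.append((rng.choice(by_layer[i]), rng.choice(earlier)))
    return layer, edges


layer, edges = random_layered_graph(layers=8, width=3, back_per_layer=2, seed=1)
print(len(layer), len(edges))  # 24 75
```

With 8 layers of 3 nodes this yields 7 * 9 = 63 forward edges and 6 * 2 = 12 backedges. Timings from such graphs are not directly comparable to the figures above, since the generator differs.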
ravenspoint
  • I tried the algorithm as suggested by the pseudo code, with Python and networkx. Unfortunately, it did not terminate with my data, especially when such a path doesn't exist. Now, I thought about another way of abstracting my real world problem, using a colored graph instead of multiple layers. With such a colored graph, I could redescribe my graph-theory problem. And I could solve this problem using another algorithm (based on the clique problem), which for my user data terminates in reasonable time (pretty fast). I will post a new topic soon, asking for improvements to my concrete implementation. – srcnuzn Apr 17 '22 at 19:08
  • You write: "the algorithm did not terminate". This does not seem possible if you implemented it correctly. The recursive loop over possible paths must terminate when the last path is found, whether or not a feasible path is present. Try removing all possible back edges from the input and see if your implementation terminates OK. – ravenspoint Apr 17 '22 at 20:43
  • Let me correct my previous statement: "My" implementation of the pseudo code did not terminate "in time" and I had to cancel the execution. Sorry for the confusion! Currently, my programming-time is very limited due to the holidays/vacation. When I'm more available again, I will post a clean implementation + user data set, so we can reproduce this behavior and reevaluate the algorithm. – srcnuzn Apr 17 '22 at 23:06
  • I forked your repository and adapted my user data (https://github.com/srcnuzn/PathFinder/). With this data and my implementation of your pseudo-code (see above), the algorithm does not terminate in time. I really think this abstraction of my problem is not feasible, as it leads to huge graphs. – srcnuzn Apr 18 '22 at 13:39
  • Why use your implementation? Mine is available at https://github.com/JamesBremner/PathFinder/blob/68ddfb42eb8797bcb25f48312ae551378dc69c8e/src/cPathFinder.cpp#L1142 "algorithm does not terminate in time" What does this mean? How long do you give it? What size of input are you using? Have you tried https://github.com/JamesBremner/PathFinder/blob/main/dat/srcnuzn.txt – ravenspoint Apr 18 '22 at 15:00
  • I have run your 422 node dataset. The reason it runs long is NOT because of the algorithm, but because the graphviz takes a very long time to complete generating the visual view. I will disable graphviz for large datasets – ravenspoint Apr 18 '22 at 15:25
  • When graphviz is disabled your dataset completes in less than a second of runtime. Here is the result: srcnuzn_ok 1 2 31 58 87 114 143 170 199 226 255 282 311 338 367 394 422 – ravenspoint Apr 18 '22 at 15:30
  • Commit https://github.com/JamesBremner/PathFinder/commit/2f3953d37494c5f86a912742c705dcca9029b0d4 – ravenspoint Apr 18 '22 at 15:32
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/243998/discussion-between-srcnuzn-and-ravenspoint). – srcnuzn Apr 18 '22 at 15:45