Should vertex-order matter when trying to two-color a directed graph?

Question

In The Algorithm Design Manual, the author provides an algorithm for two-coloring a graph. It's similar to the algorithm that counts the number of components, in that it iterates over all available vertices, and then colors and performs a BFS on that vertex only if it is not discovered:

for(i = 1; i <= (g->nvertices); i++) {
    if(discovered[i] == FALSE) {
        color[i] = WHITE;
        bfs(g, i);
    }
}

The BFS calls a process_edge function when y in the edge x -> y is not processed, or if the graph is directed. The BFS looks like this:

bfs(graph *g, int start) {

    queue q; //queue of vertices to visit
    int v; //current vertex
    int y; //successor vertex
    edgenode* p; //temporary pointer used to traverse adjacency list

    init_queue(&q);
    enqueue(&q, start);
    discovered[start] = TRUE;

    while(empty_queue(&q) == FALSE) {
        v = dequeue(&q);

        process_vertex_early(v); //function that handles early processing
        processed[v] = TRUE;
        p = g->edges[v];

        while(p != NULL) {
            y = p->y;

            //If the node hasn't already been processed, or if the graph is directed
            //process the edge. This part bothers me with undirected graphs
            //because how would you process an edge that was wrongly colored
            //in an earlier iteration when it will have processed[v] set to TRUE?
            if((processed[y] == FALSE) || g->directed) {
                process_edge(v, y); //coloring happens here
            }

            if(discovered[y] == FALSE) {
                enqueue(&q, y);
                discovered[y] = TRUE;
                parent[y] = v;
            }

            p = p->next;
        }

        process_vertex_late(v); //function that handles late processing
    }
}

The process_edge function looks like this:

process_edge(int x, int y) {
    if(color[x] == color[y]) {
        bipartite = FALSE;
        printf("Warning: not bipartite due to (%d, %d)\n", x, y);
    }

    color[y] = complement(color[x]);
}

Now assume we have a graph like this:

enter image description here

We can two-color it like this:

enter image description here

But if we are traversing it by vertex order, then we will initially start with node 1, and color it to WHITE. Then we will find node 13 and color it to BLACK. In the next iteration of the loop, we are looking at node 5, which is undiscovered and so we will color it WHITE and initiate a BFS on it. While doing this, we will discover a conflict between nodes 5 and 1 because 1 should be BLACK, but it was previously set to WHITE. We will then discover another conflict between 1 and 13, because 13 should be WHITE, but it was set to BLACK.

When performing a normal traversal of a graph through all components (connected or not), order will not matter since we will end up visiting all the nodes anyway, however the order seems to matter in the case of coloring the graph. I didn't see a mention of this in the book and only ran across this issue when I was trying to two-color a randomly-generated graph like the one above. I was able to make a small change to the existing algorithm, which eliminated this problem:

for(i = 1; i <= (g->nvertices); i++) {
    //Only initiate a BFS on undiscovered vertices and vertices that don't
    //have parents.
    if(discovered[i] == FALSE && parent[i] == NULL) {
        color[i] = WHITE;
        bfs(g, i);
    }
}

Does this change make sense, or is it a hack due to my not understanding some fundamental concept?

UPDATE

Based on G. Bach's answer, assume we have the following graph:

enter image description here

I'm still confused as to how this would end up being two-colored properly. With the original algorithm, the first iteration will initiate a BFS with node 1 to give us a graph that is colored like this:

enter image description here

In the next iteration, we will initiate a BFS with node 5 to give us a graph that is colored like this:

enter image description here

The next iteration will initiate a BFS with node 6 to give us a graph that is colored like this:

enter image description here

But now we won't re-color 5 because we have already visited it and so this leaves us with a graph that hasn't been colored properly.

score 1 · Accepted Answer · answered Dec 01 '13 at 02:07

The directed nature of the graph has no bearing on the bipartite coloring problem you posed, unless you define a different problem where the direction would indeed begin to matter. So you can convert the graph you used in your example into an undirected graph and run the algorithm as given in the textbook.

While the textbook does not explicitly mention that the graph should be undirected, edge direction has no bearing on the common coloring problems we study. However, you could define problems which take into account edge directions (http://www.labri.fr/perso/sopena/pmwiki/index.php?n=TheOrientedColoringPage.TheOrientedColoringPage).

Note: I intended to write this as a comment, but as a newbie, I am not allowed to do so, until I accumulate a few reputation points.

G. Bach · Answer 2 · 2013-04-08T02:23:50.770

0

Coloring a bipartite graph using BFS does not depend on vertex order. Call the two vertex sets that make up the partitions of the bipartite graph A and B; WLOG start at vertex a in A, color it WHITE; the first BFS will find the neighbors N(a), which will all be colored BLACK. For every v in N(a) with all v in B, this will then start a BFS (if I read this correctly), finding N(v) which is a subset of A again, coloring them all white and so on.

If you try to color a graph that is not bipartite using this, you will encounter a cycle of odd length (since having such a cycle as a subgraph is equivalent to not being bipartite) and the BFS will at some point encounter the start vertex of that odd cycle again and find that it already has a color, but not the one it wants to assign to it.

What I assume happens with your graph (since you do not include 5 in the BFS starting from 1) is that it is directed; I assume the algorithm you read was written for undirected graphs, because otherwise you do run into the problem you described. Also, coloring graphs is commonly defined on undirected graphs (can of course be transferred to directed ones).

Your fix will not solve the problem in general; add a new vertex 6 along with the edge 6->24 and you will run into the same problem. The BFS starting from 5 will want to color 1 black, the BFS starting from 6 will want to color it white. However, that graph would still be 2-colorable.

edited Apr 08 '13 at 02:23

answered Apr 08 '13 at 02:05

G. Bach

3,869
2
25
46

What about in the context of the algorithm described? Would it emit spurious warnings, but eventually end up with a graph that is legitimately 2-colored? – Vivin Paliath Apr 08 '13 at 02:26
Could you illustrate what you mean? I don't really understand that question. – G. Bach Apr 08 '13 at 02:34
@GBach, I'm updating the question. Also, the algorithm has been defined for both directed and undirected. At least it seems that way in the book. – Vivin Paliath Apr 08 '13 at 02:51
@VivinPaliath I suppose that portion of the book isn't available online? If there aren't too many pages, maybe you could scan them in and send them to me? I'd like to help but I don't see how that algorithm can be correct for directed graphs in general (unless bfs does something unexpected). – G. Bach Apr 08 '13 at 02:53
the only thing missing is the BFS algorithm. I will post that in a bit. – Vivin Paliath Apr 08 '13 at 02:55
@GBach, I have posted the BFS algorithm described in the book. Pretty straightforward; the only bit of confusion I have is when it decides to process the edge. Yes, unfortunately the book isn't available online. – Vivin Paliath Apr 08 '13 at 03:03
@VivinPaliath The problem regarding the `processed[y] == FALSE` is explained thus: during the BFS, vertices are only marked as `discovered`; they are only marked as `processed` once they were the source of a BFS (note that the check is whether the target(!) vertex of an edge was processed yet). The reason this is done for directed graphs by default is that only a topological order could tell you whether an edge still needs processing, and cyclic graphs don't have such an order. – G. Bach Apr 08 '13 at 03:32
What happens in `process_vertex_early(v);` and `process_vertex_late(v);`? Also, what is `y` initialized to? I don't know C properly. – G. Bach Apr 08 '13 at 03:37
@GBach, `process_vertex_early` and `process_vertex_late` are simply stubs. They don't contain any logic as yet. They are basically placeholders. `y` is set to the value of each node in the adjacency list of `v`. – Vivin Paliath Apr 08 '13 at 06:57

Should vertex-order matter when trying to two-color a directed graph?

2 Answers2