3

I have a large graph represented as an adjacency list. I would like to compress the graph by merging linear chains of nodes. For example, if the edges are a-c, b-c, c-d, d-e, e-f, e-g:

a - c - d - e - f
    |       |
    b       g

Then c-d, d-e can be merged into a single node x, and the new edge list should have a-x, b-x, x-g. I would like to implement this in C++, and I am wondering whether there is a C++ graph library that handles it. Any suggestion for an efficient algorithm is also appreciated.

John Zwinck
CPP_NEW
  • This might be useful: http://stackoverflow.com/questions/27105367/finding-bridges-in-a-graph-c-boost – John Zwinck Aug 10 '16 at 03:57
  • I didn't understand your example, so I took the liberty of adding ASCII art for the "problem" part. You say the solution is `a-x, b-x, x-g` but this seems clearly wrong or inconsistent to me. Can you draw the ASCII art for the solution you want? And can you explain why `c-d-e` can be merged but `a-c` cannot? – John Zwinck Aug 10 '16 at 04:03
  • Thanks @JohnZwinck for the edit!! I did not know about ASCII art. I hope you now understand why `a-c` cannot be merged. If not, here is a simple explanation: think of `a` and `b` as two sources. The info coming from them passes through `c-d-e`, and is then divided into two parts that go to `f` and `g`. – CPP_NEW Aug 10 '16 at 16:03

2 Answers

2

I think your example might be broken, so I am going to solve a slightly different one:

a - c - i - d - e - f
    |           |
    b           g
                |
                h

I think the solution looks like:

a - c - x - e - f
    |       |
    b       h

If you agree, then consider counting the number of times each vertex appears in the adjacency list, and storing the first two neighbors for each:

a b c d e f g h i
1 1 3 2 3 1 2 1 2
c c a i d e e g c
    b e g   h   d
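
The table above could be built with something like the following. This is only a minimal sketch, assuming the graph is undirected and given as a list of edges; `Edge`, `VertexInfo` and `summarize` are hypothetical names chosen for illustration.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

using Edge = std::pair<std::string, std::string>;  // assumed: one undirected edge

struct VertexInfo {
    int count = 0;                      // how many edges touch this vertex
    std::vector<std::string> firstTwo;  // at most two neighbors, in input order
};

// Count the occurrences of every vertex and remember its first two neighbors.
std::map<std::string, VertexInfo> summarize(const std::vector<Edge>& edges) {
    std::map<std::string, VertexInfo> info;
    auto record = [&info](const std::string& v, const std::string& neighbor) {
        VertexInfo& vi = info[v];
        ++vi.count;
        if (vi.firstTwo.size() < 2) vi.firstTwo.push_back(neighbor);
    };
    for (const auto& e : edges) {
        record(e.first, e.second);   // the edge counts for both endpoints
        record(e.second, e.first);
    }
    return info;
}
```

Which two neighbors end up stored for a vertex with more than two depends on the order of the input edges; that does not matter here, because only vertices with a count of exactly 2 are inspected further.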

Wherever the count is 2, we can consider collapsing. Here that is d, g, and i:

d g i # candidates
2 2 2
i e c
e h d

Now you can see that both of g's neighbors (e and h) are outside the candidates, so g is a singleton "chain": simply delete it and connect e directly to h. This leaves d, whose neighbor i is in the candidates, so collapse d and i into a new vertex x and you're done.
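
The grouping of candidates into chains might be sketched as below. It assumes a simple undirected graph stored as a symmetric adjacency map (every edge recorded at both endpoints); `Adjacency` and `findChains` are hypothetical names, not part of any library.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Assumed representation: vertex -> set of neighbors, symmetric.
using Adjacency = std::map<std::string, std::set<std::string>>;

// Group the degree-2 vertices into maximal runs of mutually adjacent candidates.
// The vertices inside one chain are returned in discovery order, not path order.
std::vector<std::vector<std::string>> findChains(const Adjacency& adj) {
    std::set<std::string> visited;
    std::vector<std::vector<std::string>> chains;
    for (const auto& kv : adj) {
        const std::string& start = kv.first;
        if (kv.second.size() != 2 || visited.count(start)) continue;
        std::vector<std::string> chain;
        std::vector<std::string> stack{start};
        visited.insert(start);
        while (!stack.empty()) {
            std::string v = stack.back();
            stack.pop_back();
            chain.push_back(v);
            // Only walk on through neighbors that are candidates themselves.
            for (const auto& n : adj.at(v)) {
                if (adj.at(n).size() == 2 && visited.insert(n).second)
                    stack.push_back(n);
            }
        }
        chains.push_back(chain);
    }
    return chains;
}
```

On the example graph this yields two chains: one containing d and i, and one containing only g. A chain of length one corresponds to the "delete" case, a longer one to the "collapse into x" case.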

John Zwinck
  • What differentiates the new `x` (which has order of 2) and `g`? You can delete `g` because one of its neighbors `h` has an order of `<= 2`. If you're going with replacing instead, you may want to collapse `g` and `h` instead of *simply deleting `g`*. Or you need to go further as Lior did and collapse all nodes of order `<= 2`. – BeyelerStudios Aug 10 '16 at 09:25
  • Yes, I need to merge `g-h` also. – CPP_NEW Aug 10 '16 at 16:10
2

You simply need to remove all nodes with degree 2, merging their two neighbors into a single node.

Repeat the process till no such nodes are left.

The Boost Graph library is usually a good way to store and work with graphs. See here how to merge vertices and contract the edge.
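
Without any library, a literal reading of the step above could look like this. It is a rough sketch only: it assumes a simple undirected graph in a symmetric adjacency map, uses plain standard containers rather than Boost, and the merged node simply keeps the name of one of the two neighbors. `Adjacency`, `mergeInto` and `collapseDegreeTwo` are hypothetical names.

```cpp
#include <map>
#include <set>
#include <string>

// Assumed representation: vertex -> set of neighbors, symmetric.
using Adjacency = std::map<std::string, std::set<std::string>>;

// Merge vertex `from` into vertex `into`, rewiring all of `from`'s edges.
void mergeInto(Adjacency& adj, const std::string& into, const std::string& from) {
    std::set<std::string> moved = adj[from];  // copy before erasing `from`
    adj.erase(from);
    for (const auto& n : moved) {
        adj[n].erase(from);
        if (n != into) {  // skip the edge that would become a self-loop
            adj[n].insert(into);
            adj[into].insert(n);
        }
    }
}

// Remove every degree-2 vertex and merge its two neighbors, until none is left.
void collapseDegreeTwo(Adjacency& adj) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (const auto& kv : adj) {
            if (kv.second.size() != 2) continue;
            // Copy the names first: the map entry is erased below.
            std::string v = kv.first;
            std::string u = *kv.second.begin();
            std::string w = *kv.second.rbegin();
            adj[u].erase(v);
            adj[w].erase(v);
            adj.erase(v);
            mergeInto(adj, u, w);
            changed = true;
            break;  // the loop iterator is no longer valid, so rescan
        }
    }
}
```

On the question's example this removes d and merges e into c, leaving a, b, f and g all attached to one node. Two caveats: the full rescan after every merge is simple but slow for a graph with 100K nodes, and because a merge changes the surviving node's degree, repeating the step can cascade well beyond the original chain, so the result should be checked against what is actually wanted.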

Community
Lior Kogan
  • Actually, I first implemented it this way in C++ for a big directed graph. My graph has almost 100K nodes, so I wanted to use a library that could make my life a little easier. I will definitely take a look at the Boost library. Thanks !! – CPP_NEW Aug 10 '16 at 16:13