Advice on data-structure for representing a Path in system

Question

I have a system where I need to represent something similar to a Path. A path just provides a route to reach a particular node, and there can be multiple Paths that reach the same node.

I currently represent a Path as a vector of Nodes. I need operations like replaceSubpath, containsNode, containsSubPath, appendNode, getRootNode, and getLeafNode (very similar to the operations available on a string). All of these can be done on a vector, but performance for a large path can be poor.

I am looking at using boost::graph but have no experience with it. Would boost::graph be a correct/good data structure for these and similar operations?

Any advice on using some other data structure would be helpful too. I am aware I can optimize my vector solution by keeping a (multi)map from node to iterator, etc.

Can you provide me some advice on how to do something like findSubPath, replaceSubPath using boost graph?? — Blackhole, Jul 12 '14 at 23:19
And the reason you don't want to represent it as a single string is? — Jim Mischel, Jul 13 '14 at 13:21
@MooingDuck I have not heard of boost::path. Can you link to it? The only path I find is boost::filesystem::path, which I don't think would be a good fit here. — pbible, Jul 22 '14 at 12:20
@pbible: Yeah, I forgot it was in `filesystem`. I don't see why using a `boost::filesystem::path` would be worse in any way than a "vector of nodes", and it has all of these operations built in. I don't know why you think it isn't a good fit here; this is its exact use case. — Mooing Duck, Jul 22 '14 at 16:47
I guess it's pretty much the same as string but with delimiters. I don't see a find subpath or replace subpath. I don't think it is any better than string and it mixes semantics. Just my opinion though. — pbible, Jul 22 '14 at 17:58
I see now, the question says 'Path in system'. [boost::Filesystem::Path](http://www.boost.org/doc/libs/1_55_0/libs/filesystem/doc/reference.html) would be a good option for that. I think the OP is concerned with Graphs though. — pbible, Jul 22 '14 at 18:17
As I mentioned, I can't use a string (my system can have Node names represented as strings). The strings are not unique but the Node pointers/handles are, hence I have to stick to using a vector of Node* — Blackhole, Jul 22 '14 at 18:53

score 2 · Answer 1 · answered Jul 13 '14 at 17:42

Essentially, the class adjacency_list<> from Boost.Graph is a vector of vertices (with the default vecS vertex storage). A vertex descriptor is then an integer index into this vector.

Typically, a tree or a path (a path is a special case of a tree, right?) is represented as a predecessor map (like going backward from leaves to root or from target to source). With integer vertex descriptors, such a predecessor map is simply a vector<int>. I do not think you can represent a path or a tree in a more compact way.
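As a minimal sketch of the predecessor-map idea, the function below walks a vector<int> predecessor map from a target back to a source and returns the vertices in source-to-target order. The name pathFromPredecessors is hypothetical, and the sketch assumes the convention that a root is its own predecessor (pred[root] == root); it does not call Boost.Graph itself.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: rebuild the path source -> target from a predecessor map.
// Assumption: pred[v] is the vertex before v, and a root satisfies pred[root] == root.
// Returns an empty vector if target is not reachable from source.
std::vector<int> pathFromPredecessors(const std::vector<int>& pred,
                                      int source, int target) {
    std::vector<int> path;
    int v = target;
    while (true) {
        path.push_back(v);
        if (v == source) break;
        const int p = pred[static_cast<std::size_t>(v)];
        // Reached some other root, or walked longer than any simple path could be.
        if (p == v || path.size() > pred.size()) return {};
        v = p;
    }
    std::reverse(path.begin(), path.end());  // collected target -> source, so flip it
    return path;
}
```

The result is an ordinary sequence of vertex descriptors, so sequence algorithms can be applied to it directly.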

Of course, such a vector of predecessors can be fed to string-style algorithms, especially those from Boost.String_Algo: http://www.boost.org/doc/libs/1_55_0/doc/html/string_algo.html
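To illustrate treating a path like a string, here is a rough sketch that uses std::search from the standard library instead of Boost.String_Algo (a deliberate substitution to keep the example dependency-free). The Path alias and both function names are assumptions made for illustration; a vector of Node* would work the same way as the vector<int> used here.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Path = std::vector<int>;  // hypothetical alias: a sequence of vertex descriptors

// True if `sub` occurs as a contiguous run inside `path`.
bool containsSubPath(const Path& path, const Path& sub) {
    if (sub.empty()) return true;  // the empty subpath is contained in any path
    return std::search(path.begin(), path.end(),
                       sub.begin(), sub.end()) != path.end();
}

// Replace the first occurrence of `oldSub` with `newSub`; returns false if not found.
// Note: `newSub` must not alias `path`.
bool replaceSubPath(Path& path, const Path& oldSub, const Path& newSub) {
    if (oldSub.empty()) return false;
    auto it = std::search(path.begin(), path.end(), oldSub.begin(), oldSub.end());
    if (it == path.end()) return false;
    it = path.erase(it, it + static_cast<std::ptrdiff_t>(oldSub.size()));
    path.insert(it, newSub.begin(), newSub.end());
    return true;
}
```

Both operations are linear scans, which is the performance concern raised in the question; the sketch only shows that no graph container is required to express them.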

Michael Simbirsky

Would you point me to some examples where these string algos were used on adjacency_list<> ?? – Blackhole Jul 14 '14 at 17:21
For example, you start with some graph algorithm which generates a predecessor map, e.g. Dijkstra shortest paths (http://www.boost.org/doc/libs/1_55_0/libs/graph/example/dijkstra-example.cpp) or Kruskal MST. Then you usually build a path of interest using this predecessor map. This path is essentially in the same form as the predecessor map, but it has no unrelated vertices. I assume you also have a potential subpath sp in the same form (as a predecessor map written as a vector). Finally, you use boost::algorithm::find_first(path, sp) to answer questions like containsSubPath – Michael Simbirsky Jul 14 '14 at 21:39
So why would I need to use adjacency_list? Can't I just keep the vector and apply various string operations on it using the Boost string algo library? The way I construct a Path currently is just vector.push_back(node). – Blackhole Jul 15 '14 at 02:03
I think @MichaelSimbirsky gives some good advice on an implementation. How will these paths/subpaths be consumed? When you "replaceSubPath" does that need to modify the original graph? Need to know more. – pbible Jul 16 '14 at 16:33

pbible · Answer 2 · 2014-08-01T18:15:28.630

From what you describe it sounds like you are generating and editing paths in a graph, perhaps for optimizing routes etc.

I don't think that one data structure will give you what you want. I would keep the graph structure separate from the paths you are generating.

replaceSubpath: To me this would suggest a doubly linked list implementation. Once you have the start and end of the subpath, splice the replacement in between them.

containsNode: Consider adding a map or set for fast containment checks.

containsSubPath: This could be tough depending on your other concerns and speed needs. If this is a very important operation, consider a suffix tree to test subpaths quickly. Keep in mind it works better if the path doesn't change much, since constructing one is O(N) and it has to be rebuilt after each change.

appendNode: A linked list will be easy here.

getRootNode: Hold a pointer to the current root node.

getLeafNode: Hold a pointer to the current leaf node.

I would make a custom data structure that can address these concerns based on your goals. Finding subpaths and replacing them quickly might be competing performance goals. Usually more search optimization means more construction overhead, which makes the structure less dynamic.
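A rough sketch of such a custom structure follows, assuming a doubly linked list for the node sequence and a per-node occurrence count for containment checks. The Node struct, the class layout, and the nodes() accessor are all hypothetical; the suffix tree idea is left out, so finding the subpath is still a linear search.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <list>
#include <unordered_map>
#include <vector>

struct Node { const char* name; };  // hypothetical node type; names need not be unique

class Path {
public:
    void appendNode(const Node* n) { nodes_.push_back(n); ++count_[n]; }
    bool containsNode(const Node* n) const { return count_.count(n) != 0; }
    const Node* getRootNode() const { return nodes_.empty() ? nullptr : nodes_.front(); }
    const Node* getLeafNode() const { return nodes_.empty() ? nullptr : nodes_.back(); }

    // Replace the first occurrence of oldSub with newSub; false if oldSub is absent.
    bool replaceSubPath(const std::vector<const Node*>& oldSub,
                        const std::vector<const Node*>& newSub) {
        if (oldSub.empty()) return false;
        auto first = std::search(nodes_.begin(), nodes_.end(),
                                 oldSub.begin(), oldSub.end());
        if (first == nodes_.end()) return false;
        auto last = std::next(first, static_cast<std::ptrdiff_t>(oldSub.size()));
        for (auto it = first; it != last; ++it) {   // keep the counts in sync
            auto c = count_.find(*it);
            if (--c->second == 0) count_.erase(c);
        }
        auto pos = nodes_.erase(first, last);       // list erase: no element shifting
        nodes_.insert(pos, newSub.begin(), newSub.end());
        for (const Node* n : newSub) ++count_[n];
        return true;
    }

    // Copy of the current sequence, mainly for inspection.
    std::vector<const Node*> nodes() const {
        return std::vector<const Node*>(nodes_.begin(), nodes_.end());
    }

private:
    std::list<const Node*> nodes_;                        // path order, root first
    std::unordered_map<const Node*, std::size_t> count_;  // occurrences per node
};
```

The count map makes containsNode an average constant-time lookup at the cost of extra bookkeeping on every edit, which is the search-versus-construction trade-off described above.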

score 0 · Answer 3 · answered Aug 01 '14 at 20:12

Take a look at how some other code that you admire implements the need to manage paths. For example, you might look at several implementations of Dijkstra and choose the one that looks best, most convenient, or just to your taste.

IMHO it is not a good idea to model a "path" as an object, but rather think of it as a property of the nodes in a graph.

In general, I would consider 'marking' nodes that are on the path. For example, the class you use to contain the properties of the nodes might have a flag indicating true if the node is on the path and an attribute with the index of the next node on the path.
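The marking approach could look roughly like the sketch below, assuming nodes are identified by integer indices and -1 means "no next node". NodeProps, markPath, and walkPath are hypothetical names. Because each node stores a single flag and a single successor, this form holds only one path at a time and cannot represent a path that visits the same node twice.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-node properties: a flag plus the index of the next node on the path.
struct NodeProps {
    bool onPath = false;
    int next = -1;  // -1 marks the end of the path
};

// Mark every node listed in `path` (in order) inside the property table.
void markPath(std::vector<NodeProps>& props, const std::vector<int>& path) {
    for (std::size_t i = 0; i < path.size(); ++i) {
        NodeProps& p = props[static_cast<std::size_t>(path[i])];
        p.onPath = true;
        p.next = (i + 1 < path.size()) ? path[i + 1] : -1;
    }
}

// Follow the `next` links from `start` and collect the marked nodes.
std::vector<int> walkPath(const std::vector<NodeProps>& props, int start) {
    std::vector<int> out;
    int v = start;
    while (v != -1 && props[static_cast<std::size_t>(v)].onPath) {
        out.push_back(v);
        if (out.size() > props.size()) return {};  // guard against a cyclic marking
        v = props[static_cast<std::size_t>(v)].next;
    }
    return out;
}
```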

I get what you are saying about marking paths. How will this address the OP's needs of finding and replacing subpaths? Also if many paths are needed adding flags to the vertex properties would not scale. Maybe I am missing something. — pbible, Aug 08 '14 at 20:06
