I've been playing around with the forward-backward algorithm to find the most efficient path from State 1 to State N, where efficiency is determined by a cost function that depends on how the current state differs from the next state. The picture below shows a short version of the problem with just 3 States and 2 Nodes per State. I run the forward-backward algorithm on that and find the best path as usual. The red parts of the picture are the paths checked during the forward-propagation step of the code.

Now the interesting bit: I want to find the best 3-State-length path (as before), but now only the Nodes in the first State are known. The other 4 are free-floating and can be considered to be in any State (State 2 or State 3). Does anyone have a good idea of how to do this?

Picture: https://i.stack.imgur.com/TJZg8.jpg

Note: Bear in mind that the original problem consists of around 25 States and 100 Nodes per State. So you'll know the State of the roughly 100 Nodes in State 1, but the other 24*100 Nodes are Stateless. In this case, I want to find a 25-State-length path (with minimum cost).

Addendum: Someone pointed out that a better algorithm would be the Viterbi algorithm. So here is a problem with more variables thrown in. Can you explain how that would be implemented? The same rules apply: the path should start from one of the Nodes in State 1 (Node a or Node b). Also, a cost function using the norm doesn't make much sense in this case, since we only have one property (the Size of a node), but in the actual problem I'm expecting a lot more properties.

A better picture of the problem.

syamjam
  • Is the exact state of nodes initially in the set of those in state 2 and 3 known after being visited? That is, does visiting a node resolve it to a certain state? If not, what is the difference between state 2 and 3? – Justin R. Jun 16 '14 at 17:55
  • How can a state be free-floating? What does that mean? How is the cost function evaluated? – Nico Schertler Jun 16 '14 at 18:01
  • The nodes, let's call them nodes a,b,c & d are free to take whatever state. The main outcome of the dynamic programming problem is a 3-state length path where state 1 would be either one of the 2 nodes known to it and states 2 and 3 can be any combinations of nodes a, b, c and d. The only restriction placed is that: 1. Nodes can't be in two states at once. 2. State 2 and 3 must be included in the 3-state length path answer. Free-floating means any node can be in any state (2 and 3). The cost function is the norm of the properties of the nodes in the previous state - the current state. – syamjam Jun 16 '14 at 18:03
  • I have the feeling that this problem can be modeled differently. Can you elaborate your actual problem (what do you need this for?) – Nico Schertler Jun 16 '14 at 18:05
  • I don't see how the forward-backward algorithm solves this. Isn't finding the optimal path the Viterbi algorithm's job? – Fred Foo Jun 16 '14 at 18:07
  • I see. I think assigning states 2 and 3 makes this very confusing. I'll edit the main question in a bit. – syamjam Jun 16 '14 at 18:12
  • So if we interpret the nodes' properties as coordinates, then you want to find the path with a constant number of steps and minimal length starting at a node which lies in a given set of start nodes. Right? Btw, the solution of your small example is b-d-e, right? – Nico Schertler Jun 16 '14 at 18:45
  • Indeed, one of the results is b-d-e. But it can also be b-e-d. Coordinate distance between nodes is not considered in this case. – syamjam Jun 16 '14 at 18:55
  • But doesn't b-e-d have a cost of 3+3=6, whereas b-d-e has 0+3=3? – Nico Schertler Jun 16 '14 at 18:56
  • Ah, that is true. I miscalculated. Mea culpa. – syamjam Jun 16 '14 at 18:57
  • Seems to be a good fit for the forward-backward algorithm. All you have to do is propagate the set of visited nodes (not only the total cost and predecessor). Then only examine the edges that do not lead to a node that has already been visited. – Nico Schertler Jun 16 '14 at 19:00
  • I was thinking of doing that but for a larger number of nodes, wouldn't the problem be excruciatingly slow to solve? Do you mind writing what you have in mind for the forward-propagation algorithm with visited nodes propagation using the above example (oh, and post it as an answer)? – syamjam Jun 16 '14 at 19:06

1 Answer

A variation of Dijkstra's algorithm might be faster for your problem than the forward-backward algorithm, because it does not analyze all nodes at once. Dijkstra is a DP algorithm after all.

Let a node be specified by

Node:
   Predecessor : Node
   Total cost : Number      
   Visited nodes : Set of nodes  (e.g. a hash set or other performant set)

Initialize the algorithm with

open set : ordered (by total cost) set of nodes = set of possible start nodes (set visitedNodes to the one-element set with the current node)
           ( = {a, b} in your example)

Then execute the algorithm:

do while open set is not empty
    n := pop element with lowest total cost from open set
    if(n.visitedNodes.count == stepTarget)
        we're done, backtrace the path from this node
    else
        for each n2 in available nodes
            if not n2 in n.visitedNodes
                push copy of n2 to open set (the same node might appear multiple times in the set):
                    .totalCost := n.totalCost + norm(n2 - n)
                    .visitedNodes := n.visitedNodes u { n2 }  //u = set union
                    .predecessor := n
        next
loop
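
The pseudocode above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the function name `cheapest_path`, its parameters, and the node sizes in the example are made up (the sizes are chosen so that b-d-e costs 0+3=3 and b-e-d costs 3+3=6, as in the comments), and the norm is taken to be the Euclidean distance between property tuples.

```python
import heapq
import itertools
import math


def cheapest_path(start_nodes, free_nodes, props, step_target):
    """Uniform-cost (Dijkstra-style) search over (node, visited set) labels.

    start_nodes: names of nodes allowed in the first position
    free_nodes:  names of nodes that may fill any later position
    props:       dict mapping node name -> tuple of numeric properties
    step_target: required number of nodes on the path
    """
    tie = itertools.count()  # unique tie-breaker so heap entries never compare sets
    # Heap entry layout: (total cost, tie, node, visited set, predecessor entry)
    open_set = [(0.0, next(tie), s, frozenset([s]), None) for s in start_nodes]
    heapq.heapify(open_set)

    while open_set:
        entry = heapq.heappop(open_set)  # cheapest partial path so far
        cost, _, node, visited, _ = entry
        if len(visited) == step_target:
            # Done: backtrace through the predecessor entries
            path = []
            while entry is not None:
                path.append(entry[2])
                entry = entry[4]
            return cost, path[::-1]
        for nxt in free_nodes:
            if nxt in visited:
                continue  # a node cannot be used twice
            step = math.dist(props[node], props[nxt])  # assumed cost: Euclidean norm
            heapq.heappush(
                open_set, (cost + step, next(tie), nxt, visited | {nxt}, entry)
            )
    return None  # no path with step_target nodes exists


# Hypothetical sizes for the small example (one property per node)
props = {"a": (1.0,), "b": (5.0,), "c": (9.0,),
         "d": (5.0,), "e": (8.0,), "f": (12.0,)}
print(cheapest_path(["a", "b"], ["c", "d", "e", "f"], props, 3))
```

Because every entry carries its own visited set, the number of entries can grow quickly with the path length, so for 25 steps over 2400 free nodes this sketch may well be too slow without additional pruning.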

If calculating the norm is expensive, you might want to calculate it on demand and store it in a map.

Nico Schertler
  • I'm not familiar with sets in programming. Is that by any chance similar to a list or a C# stack? – syamjam Jun 16 '14 at 19:50
  • In C# you can use a `HashSet`. It's a collection (just like `List` or `Stack`) that is optimized for finding if an element is present. For the open set you can use `SortedSet`. – Nico Schertler Jun 16 '14 at 19:53