Hey, I'm thinking of using A* to find an optimal solution for the Word Ladder problem, but I'm having a bit of difficulty coming up with an appropriate g(x) and h(x). For this particular problem, could g(x) be the number of hops from the start vertex and h(x) be the number of characters that differ from the goal word? Any advice would be a big help.
1 Answer
I was never really a fan of the f(x) = g(x) + h(x) notation for A* because it oversimplifies the algorithm. A* is based on two heuristics, often labelled g(x) and h(x).
You already have most of it figured out. For the Dijkstra part, g(x), you want to return the number of hops taken so far. For the greedy part, h(x), you want to count how many characters still differ from the goal word; you are at the goal when h(x) = 0.
Summing these two values gives the A* priority f(x), which essentially says to expand the most promising node first: the one with the lowest combined cost so far and estimated cost remaining. In other projects, you may wish to add further heuristics to A* to get behaviours like avoiding enemies (this is why I prefer not to think of A* as just g + h).
EDIT: Don't forget to check each candidate using a dictionary file; word ladder requires that intermediate words be real words.
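
To make the above concrete, here is a minimal Python sketch, not a definitive implementation. The names `hamming` and `word_ladder` are made up for illustration, the dictionary is assumed to be a plain `set` of lowercase words, and each hop is assumed to change exactly one letter. Under that assumption the character-difference count never overestimates the remaining hops, so the first time the goal is popped the path is a shortest one.

```python
import heapq
from string import ascii_lowercase


def hamming(a, b):
    """h(x): number of positions where the two words differ."""
    return sum(x != y for x, y in zip(a, b))


def word_ladder(start, goal, words):
    """Return a shortest ladder from start to goal as a list, or None.

    `words` is an assumed dictionary: a set of valid lowercase words.
    """
    if len(start) != len(goal) or goal not in words:
        return None

    # Heap entries are (f, g, word, path) with f = g + h.
    frontier = [(hamming(start, goal), 0, start, [start])]
    best_g = {start: 0}  # fewest hops found so far for each word

    while frontier:
        f, g, word, path = heapq.heappop(frontier)
        if word == goal:
            return path
        if g > best_g.get(word, float("inf")):
            continue  # stale entry: a shorter route to this word was found

        # Generate every word that differs in exactly one position.
        for i in range(len(word)):
            for c in ascii_lowercase:
                if c == word[i]:
                    continue
                nxt = word[:i] + c + word[i + 1:]
                if nxt not in words:
                    continue  # intermediate words must be real words
                new_g = g + 1  # g(x): one more hop
                if new_g < best_g.get(nxt, float("inf")):
                    best_g[nxt] = new_g
                    new_f = new_g + hamming(nxt, goal)
                    heapq.heappush(frontier, (new_f, new_g, nxt, path + [nxt]))

    return None
```

For example, with the toy dictionary `{"hot", "dot", "dog", "lot", "log", "cog"}`, calling `word_ladder("hit", "cog", words)` returns a five-word ladder starting at "hit" and ending at "cog". In practice you would load `words` from a dictionary file instead.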

- g(x) is not a heuristic; it's the real cost of the path found so far. That is, for any node n, g(n) is the cost of the (already explored) path from the start node to n, while h(n) is the estimated, or heuristic, cost to reach the goal node from n. – FrankS101 Mar 09 '16 at 14:22
- In a sense, you are correct. The reason I refer to it as a heuristic is that there is no guarantee that distance *is* the real cost of the path. Instead, we are using distance as a factor in solving our problem (by comparing solutions). Depending on the problem set, it is possible that distance won't be a consideration at all. – Aaron3468 Mar 09 '16 at 14:28