Python word game. Last letter of first word == first letter of second word. Find longest possible sequence of words

Question

I'm trying to write a program that mimics a word game where, from a given set of words, it will find the longest possible sequence of words. No word can be used twice.

I can match up the letters and words and store them in lists, but I'm having trouble getting my head around how to handle the potentially exponential number of possible word sequences. If word 1 matches word 2 and I go down that route, how do I then back up to see whether words 3 or 4 also match word 1 and start their own routes, all stemming from the first word?

I was thinking some way of calling the function inside itself maybe?

I know it's nowhere near doing what I need it to do, but it's a start. Thanks in advance for any help!

g = "audino bagon baltoy banette bidoof braviary bronzor carracosta charmeleon cresselia croagunk darmanitan deino emboar emolga exeggcute gabite girafarig gulpin haxorus"

def pokemon():
    count = 1
    names = g.split()
    first = names[count]
    master = []
    for i in names:
        print (i, first, i[0], first[-1])
        if i[0] == first[-1] and i not in master:
            master.append(i)
            count += 1
            first = i
            print ("success", master)
    if len(master) == 0:
        return "Pokemon", first, "does not work"
    count += 1
    first = names[count]

pokemon()

jme · Accepted Answer · 2014-12-23T00:27:25.280

Your idea of calling a function inside of itself is a good one. We can solve this with recursion:

def get_neighbors(word, choices):
    return set(x for x in choices if x[0] == word[-1])

def longest_path_from(word, choices):
    choices = choices - set([word])
    neighbors = get_neighbors(word, choices)

    if neighbors:
        paths = (longest_path_from(w, choices) for w in neighbors)
        max_path = max(paths, key=len)
    else:
        max_path = []

    return [word] + max_path

def longest_path(choices):
    return max((longest_path_from(w, choices) for w in choices), key=len)

Now we just define our word list:

words = ("audino bagon baltoy banette bidoof braviary bronzor carracosta "
         "charmeleon cresselia croagunk darmanitan deino emboar emolga "
         "exeggcute gabite girafarig gulpin haxorus")

words = frozenset(words.split())

Call longest_path with a set of words:

>>> longest_path(words)
['girafarig', 'gabite', 'exeggcute', 'emolga', 'audino']

A couple of things to know: as you point out, this has exponential complexity, so beware! Also, note that Python has a recursion limit, although here the recursion only goes as deep as the number of words.
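If the recursion limit is a concern, the same exhaustive search can be written with an explicit stack. This is only a sketch of that idea, not part of the answer above: the name longest_chain_iterative is made up for illustration, and the running time is still exponential in the worst case.

```python
def longest_chain_iterative(words):
    """Exhaustive search using an explicit stack instead of recursion."""
    words = list(words)
    # Index the words by first letter so successors are quick to look up.
    by_first = {}
    for word in words:
        by_first.setdefault(word[0], []).append(word)

    best = []
    # Every stack entry is one partial chain, stored as a tuple of words.
    stack = [(word,) for word in words]
    while stack:
        chain = stack.pop()
        if len(chain) > len(best):
            best = list(chain)
        # Extend the chain with every unused word that starts with
        # the last letter of the chain's final word.
        for nxt in by_first.get(chain[-1][-1], []):
            if nxt not in chain:  # no word may be used twice
                stack.append(chain + (nxt,))
    return best


words = ("audino bagon baltoy banette bidoof braviary bronzor carracosta "
         "charmeleon cresselia croagunk darmanitan deino emboar emolga "
         "exeggcute gabite girafarig gulpin haxorus").split()

print(longest_chain_iterative(words))
```

For this word list there is exactly one chain of five words, so the result matches the recursive version.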

score 2 · Answer 2 · edited May 23 '17 at 12:14

Using some black magic and graph theory I found a partial solution that might be good (not thoroughly tested).

The idea is to map your problem into a graph problem rather than a simple iterative problem (although that might work too!). So I defined the nodes of the graph to be the first letters and last letters of your words. I can only create edges between nodes of type first and last, and I cannot map node first number X to node last number X (a word cannot be followed by itself). From there, your problem is the same as the longest path problem, which is NP-hard in the general case :)
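One way to read the mapping described above: every word is a node, and there is a directed edge from word A to word B when A's last letter equals B's first letter. A minimal sketch of that structure using only the standard library (build_word_graph is a hypothetical helper name, not something from networkx):

```python
from collections import defaultdict


def build_word_graph(words):
    """Map each word to the list of words that may follow it."""
    by_first = defaultdict(list)
    for word in words:
        by_first[word[0]].append(word)
    # A word is never its own successor, even if it starts and ends
    # with the same letter (for example "girafarig").
    return {
        word: [nxt for nxt in by_first.get(word[-1], []) if nxt != word]
        for word in words
    }


words = ("audino bagon baltoy banette bidoof braviary bronzor carracosta "
         "charmeleon cresselia croagunk darmanitan deino emboar emolga "
         "exeggcute gabite girafarig gulpin haxorus").split()

graph = build_word_graph(words)
print(graph["girafarig"])  # ['gabite', 'gulpin']
print(graph["audino"])     # []
```

A chain of words is then a path in this graph that visits no node twice.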

By taking some information here: stackoverflow-17985202 I managed to write this:

g = "audino bagon baltoy banette bidoof braviary bronzor carracosta charmeleon cresselia croagunk darmanitan deino emboar emolga exeggcute gabite girafarig gulpin haxorus"
words = g.split()
begin = [w[0] for w in words]  # Nodes first
end = [w[-1] for w in words]  # Nodes last

links = []
for i, l in enumerate(end):  # Construct edges
    ok = True
    offset = 0
    while ok:
        try:
            bl = begin.index(l, offset)
            if i != bl:  # Cannot map to self
                links.append((i, bl))
            offset = bl + 1  # next possible edge
        except ValueError:  # no more possible edge for this last node, Next!
            ok = False

# Great function shamelessly taken from stackoverflow (link provided above)
import networkx as nx
def longest_path(G):
    dist = {}  # maps node -> (distance, predecessor)
    for node in nx.topological_sort(G):
        # pairs of dist,node for all incoming edges
        pairs = [(dist[v][0]+1,v) for v in G.pred[node]]
        if pairs:
            dist[node] = max(pairs)
        else:
            dist[node] = (0, node)
    node,(length,_)  = max(dist.items(), key=lambda x:x[1])
    path = []
    while length > 0:
        path.append(node)
        length,node = dist[node]
    return list(reversed(path))

# Construct graph
G = nx.DiGraph()
G.add_edges_from(links)
# TADAAAA!
print([words[i] for i in longest_path(G)])  # nodes are word indices, so map them back to words

Although it looks nice, there is a big drawback. Your example works because the graph built from the input words contains no cycle; this solution fails on cyclic graphs. A way around that is to detect cycles and break them. Detection can be done this way:

if nx.recursive_simple_cycles(G):
    print("CYCLES!!! /o\\")
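If networkx is not available, the same yes/no check can be sketched with a plain depth-first search that marks nodes as unvisited, in progress, or finished. The helper names word_graph and has_cycle are made up for this illustration; this is one possible approach, not what networkx does internally.

```python
def word_graph(words):
    """Map each word to the words that may follow it (never itself)."""
    return {a: [b for b in words if b != a and b[0] == a[-1]] for a in words}


def has_cycle(graph):
    """Return True if the directed graph (node -> successors) has a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, in progress, finished
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:  # reached a node that is still in progress
                return True
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[node] == WHITE and visit(node) for node in graph)


print(has_cycle(word_graph(["girafarig", "gabite", "exeggcute", "emolga", "audino"])))
print(has_cycle(word_graph(["tiger", "rat", "toad"])))  # tiger -> rat -> tiger
```

The first call reports no cycle and the second reports one, because "tiger" and "rat" can follow each other indefinitely.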

A cycle can be broken by dropping one of its edges, but dropping a random edge only finds the optimal solution by chance (imagine a cycle with a tail: you should cut the cycle at the node that has 3 edges). I therefore suggest brute-forcing this part: try every possible cycle break, compute the longest path for each, and keep the longest of those. With multiple cycles the number of possibilities explodes even further... but hey, it's NP-hard, at least the way I see it, and I didn't plan to solve that now :)

Hope it helps

twasbrillig · Answer 3 · 2014-12-23T01:13:13.310

Here's a solution that doesn't require recursion. It uses itertools.permutations to look at all possible orderings of the words and keeps the ordering whose chained words form the longest string. To save time, as soon as an ordering hits a word that doesn't link to the previous one, it stops checking that ordering and moves on.

>>> import itertools
... g = 'girafarig eudino exeggcute omolga gabite'
... p = itertools.permutations(g.split())
... longestword = ""
... for words in p:
...     thistry = words[0]
...     # Concatenates words until the next word doesn't link with this one.
...     for i in range(len(words) - 1):
...         if words[i][-1] != words[i+1][0]:
...             break
...         thistry += words[i+1]
...     if len(thistry) > len(longestword):
...         longestword = thistry
...         print(longestword)
... print("Final answer is {}".format(longestword))
girafarig
girafariggabiteeudino
girafariggabiteeudinoomolga
girafariggabiteexeggcuteeudinoomolga
Final answer is girafariggabiteexeggcuteeudinoomolga
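The loop above ranks orderings by the length of the concatenated string. If the goal is the largest number of words instead, a variant can keep the chain as a list and compare word counts. This is a sketch under that assumption; longest_chain_by_permutations is a name invented here, and the approach still examines up to n! orderings, so it only suits short word lists.

```python
import itertools


def longest_chain_by_permutations(words):
    """Try every ordering and keep the longest valid prefix, by word count."""
    best = []
    for order in itertools.permutations(words):
        if not order:  # empty input yields a single empty permutation
            continue
        chain = [order[0]]
        # Walk consecutive pairs and stop at the first pair that does not link.
        for prev, nxt in zip(order, order[1:]):
            if prev[-1] != nxt[0]:
                break
            chain.append(nxt)
        if len(chain) > len(best):
            best = chain
    return best


g = "girafarig eudino exeggcute omolga gabite"
print(longest_chain_by_permutations(g.split()))
```

For this five-word input the only ordering that links all five words is girafarig, gabite, exeggcute, eudino, omolga, which agrees with the concatenated result shown above.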

score 0 · Answer 4 · answered Dec 23 '14 at 02:57

First, let's see what the problem looks like:

from collections import defaultdict
import pydot

words = (
    "audino bagon baltoy banette bidoof braviary bronzor carracosta "
    "charmeleon cresselia croagunk darmanitan deino emboar emolga "
    "exeggcute gabite girafarig gulpin haxorus"
).split()

def main():
    # get first -> last letter transitions
    nodes = set()
    arcs = defaultdict(lambda: defaultdict(list))
    for word in words:
        first = word[0]
        last = word[-1]        
        nodes.add(first)
        nodes.add(last)
        arcs[first][last].append(word)

    # create a graph
    graph = pydot.Dot("Word_combinations", graph_type="digraph")
    # use letters as nodes
    for node in sorted(nodes):
        n = pydot.Node(node, shape="circle")
        graph.add_node(n)
    # use first-last as directed edges
    for first, sub in arcs.items():
        for last, wordlist in sub.items():
            count = len(wordlist)
            label = str(count) if count > 1 else ""
            e = pydot.Edge(first, last, label=label)
            graph.add_edge(e)

    # save result
    graph.write_png("g:/temp/wordgraph.png", prog="dot")

if __name__=="__main__":
    main()

results in

[image: directed graph of the first-letter to last-letter transitions]

which makes the solution fairly obvious (path shown in red), but only because the graph is acyclic (with the exception of two trivial self-loops).
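For readers without pydot or Graphviz installed, the same first-letter to last-letter transitions can be listed as text. A minimal sketch using only the standard library; letter_arcs is a hypothetical helper name.

```python
from collections import defaultdict

words = (
    "audino bagon baltoy banette bidoof braviary bronzor carracosta "
    "charmeleon cresselia croagunk darmanitan deino emboar emolga "
    "exeggcute gabite girafarig gulpin haxorus"
).split()


def letter_arcs(words):
    """Group words by their (first letter, last letter) transition."""
    arcs = defaultdict(list)
    for word in words:
        arcs[(word[0], word[-1])].append(word)
    return dict(arcs)


arcs = letter_arcs(words)
for (first, last), group in sorted(arcs.items()):
    print(f"{first} -> {last}: {', '.join(group)}")

# Transitions that start and end on the same letter are the self-loops.
self_loops = sorted(key for key in arcs if key[0] == key[1])
print("self-loops:", self_loops)
```

In this view each word is an edge between letters, and the two self-loops come from "exeggcute" and "girafarig".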
