I have a very large DAG of strings (~200k). I would like to find the longest path in this graph. The code below shows how I've set up the graph (from the list of strings new_list).
import string
import networkx as nx

# create new empty graph
g = nx.DiGraph()
# add all words to graph
for word in new_list:
    g.add_node(word)
# fill graph with valid word chains
for word in g.nodes():
    for c in string.ascii_lowercase:
        new_word = alphagramatize(word + c)  # add char c to word in alphagram order
        if binary_search(new_list, new_word) != -1:
            g.add_edge(word, new_word)
I have attempted the naive approach of checking the path distance from every node to every other node, but this is clearly impractical at this size and does not finish in any reasonable time.
I have also attempted to rework the longest_path() code from this thread, to no avail.
I have read as much as I could understand about topological sorting and ordering of the graph, but am having trouble implementing it. NetworkX provides a function topological_sort(g), so that work is done for me. However, how do I proceed from here now that I have a topologically sorted graph?
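For reference, one common way to use a topological order here is a single dynamic-programming pass: visit nodes in that order and, for each edge, record the longest chain known to reach the target node. The sketch below is only an illustration of that idea, not code from the thread mentioned above. The function name `longest_path`, its two parameters, and the tiny `edges`/`order` example are all assumptions made for the illustration; it is written in plain Python so it runs without the graph from the question.

```python
def longest_path(order, successors):
    """Return one longest path (list of nodes) in a DAG.

    order      -- every node of the graph, in topological order
    successors -- callable mapping a node to its out-neighbours
    (Both parameter names are hypothetical, chosen for this sketch.)
    """
    order = list(order)
    if not order:
        return []
    dist = {node: 0 for node in order}     # edges on the longest path ending at node
    prev = {node: None for node in order}  # predecessor on that path
    for node in order:
        # All predecessors of `node` were processed earlier, so dist[node] is final.
        for nxt in successors(node):
            if dist[node] + 1 > dist[nxt]:
                dist[nxt] = dist[node] + 1
                prev[nxt] = node
    # Walk back from the node with the largest distance.
    end = max(order, key=dist.get)
    path = []
    while end is not None:
        path.append(end)
        end = prev[end]
    path.reverse()
    return path


# Tiny made-up example: adjacency lists plus a valid topological order.
edges = {"a": ["ab"], "ab": ["abc", "abd"], "abc": ["abcd"],
         "abd": [], "abcd": [], "x": []}
order = ["x", "a", "ab", "abd", "abc", "abcd"]
print(longest_path(order, lambda n: edges[n]))  # ['a', 'ab', 'abc', 'abcd']
```

With the graph from the question, the call would presumably look like `longest_path(list(nx.topological_sort(g)), g.successors)`. The pass touches each node and edge once, so it should scale to ~200k nodes. NetworkX also provides `nx.dag_longest_path(g)`, which may be worth trying first if the installed version includes it.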