I've developed a graph clustering model for assigning destinations to vehicle routes, but my implementation is very slow. It takes about two days to process a graph with 400k nodes.
My current implementation in Python is as follows:
Input data is a sparse graph:
Edges are roads
Nodes are vehicle destinations and road intersections
Create Minimum Spanning Tree using Prims Algorithm
For every edge in the MST:
Perform depth-first-search on the the two subgraphs on each side of the edge:
Sum up road lengths for each edge
If total road length for one of the subgraphs is within a defined range, then remove the edge
Any recommendations to make this implementation faster? Could using Networkx or Neo4J speed this up?