I am designing a system that computes the best shipping route for commercial containers.
The path a container typically takes is:
pickup -> port of load -> port of destination -> delivery
I have compiled a list of known locations where a pickup or delivery can take place (such as cities), a list of ports, and the connections between them.
A sample of the data can be seen here.
When looking for a route from Austin to Frankfurt, the query should return only this path:
- Austin -> Florida -> Port of Florida -> Port of Hamburg -> Frankfurt
Austin -> NYC -> Port of NYC -> Port of London -> Port of Hamburg -> Frankfurt is ruled out because it has two international steps.
The query also returns round trips, which it should not. For example:
Austin -> Florida -> Port of Florida -> Port of Hamburg -> Berlin -> Port of Hamburg -> Frankfurt
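To make the expected behaviour concrete, here is a minimal plain-Python sketch (not Gremlin) of the two rules applied to the example paths. The `EDGES` table and the `is_acceptable` helper are hypothetical and exist only for illustration; the per-edge `international_stops` values are assumptions chosen to match the examples above, not the real data.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: (from, to) -> international steps on that edge.
# The values are assumed for illustration only.
EDGES: Dict[Tuple[str, str], int] = {
    ("Austin", "Florida"): 0,
    ("Florida", "Port of Florida"): 0,
    ("Port of Florida", "Port of Hamburg"): 1,
    ("Port of Hamburg", "Frankfurt"): 0,
    ("Port of Hamburg", "Berlin"): 0,
    ("Berlin", "Port of Hamburg"): 0,
    ("Austin", "NYC"): 0,
    ("NYC", "Port of NYC"): 0,
    ("Port of NYC", "Port of London"): 1,
    ("Port of London", "Port of Hamburg"): 1,
}


def is_acceptable(path: List[str], max_international: int = 1) -> bool:
    """Return True if the path has no repeated vertex and stays within
    the allowed number of international steps."""
    # A round trip shows up as a vertex that appears more than once.
    if len(set(path)) != len(path):
        return False
    # Sum the international steps over consecutive vertex pairs.
    total = sum(EDGES[(a, b)] for a, b in zip(path, path[1:]))
    return total <= max_international


wanted = ["Austin", "Florida", "Port of Florida", "Port of Hamburg", "Frankfurt"]
two_steps = ["Austin", "NYC", "Port of NYC", "Port of London",
             "Port of Hamburg", "Frankfurt"]
round_trip = ["Austin", "Florida", "Port of Florida", "Port of Hamburg",
              "Berlin", "Port of Hamburg", "Frankfurt"]
```

Under these assumptions only `wanted` passes: `two_steps` exceeds the international-step limit and `round_trip` visits Port of Hamburg twice.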
Thus far I have composed the following Gremlin query (gremlin-python):
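    (
        g.V(*from_vertices)
        .repeat(
            # follow only edges valid for this forwarder, quote method and date
            outE()
            .has("ff_id", within(ff_id, "ANY"))
            .has("quote_methods", containing(quote_method.value))
            .has("valid_to", gte(current_date))
            .has("valid_from", lte(current_date))
            .in_v()
        )
        .until(hasId(within(*to_vertices)))
        .path()
        .as_("p")
        # total international steps on the path; elements without the property count as 0
        .map(unfold().coalesce(values("international_stops"), constant(0)).sum_())
        .as_("international_stops")
        # keep only paths with at most one international step
        .filter_(select("international_stops").is_(lte(1)))
        .select("p")
        # collect the pricing document ids along each remaining path
        .map(unfold().coalesce(values("pricing_document_ids"), constant("")).fold())
        .to_list()
    )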
I face two issues:
- Loops: the graph contains many loops. In addition to immediate ones, it also contains round trips that span an arbitrary number of edges.
- Due to memory and performance limitations, I am unable to fetch all paths first and then filter out the ones containing loops.
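The behaviour I am after is pruning during the walk rather than after it. The sketch below shows that idea in plain Python on a hypothetical toy graph (`GRAPH` and `find_routes` are illustrative names, and the edge values are assumptions): a branch is dropped as soon as it would revisit a vertex or exceed the international-step budget, so looping paths are never materialised. In Gremlin terms this would presumably mean filtering inside `repeat()`, for example with the `simplePath()` step, instead of after `path()`; I have not verified how that performs on the real graph.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical adjacency list: vertex -> [(neighbour, international steps on the edge)].
GRAPH: Dict[str, List[Tuple[str, int]]] = {
    "Austin": [("Florida", 0), ("NYC", 0)],
    "Florida": [("Port of Florida", 0)],
    "NYC": [("Port of NYC", 0)],
    "Port of Florida": [("Port of Hamburg", 1)],
    "Port of NYC": [("Port of London", 1)],
    "Port of London": [("Port of Hamburg", 1)],
    "Port of Hamburg": [("Frankfurt", 0), ("Berlin", 0)],
    "Berlin": [("Port of Hamburg", 0)],
}


def find_routes(start: str, goal: str,
                max_international: int = 1) -> Iterator[List[str]]:
    """Depth-first search that prunes loops and over-budget branches early."""
    # Each stack entry is (path so far, international steps used so far).
    stack: List[Tuple[List[str], int]] = [([start], 0)]
    while stack:
        path, used = stack.pop()
        node = path[-1]
        if node == goal:
            yield path
            continue
        for neighbour, stops in GRAPH.get(node, []):
            if neighbour in path:
                continue  # would revisit a vertex: drop the branch now
            if used + stops > max_international:
                continue  # too many international steps: drop the branch now
            stack.append((path + [neighbour], used + stops))


routes = list(find_routes("Austin", "Frankfurt"))
```

On this toy graph `routes` contains only the Austin -> Florida -> Port of Florida -> Port of Hamburg -> Frankfurt path; the Berlin round trip and the route via London are cut off before they reach Frankfurt.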