I have a query that needs to return the complete path based on an edge property. The graph contains locations and people. People can travel from one location to another, so I want a complete map of where each person started and where they ended. Suppose:
P1 travels from a -> b -> c -> d -> e -> f
P2 travels from c -> d -> e -> f
P3 travels from a -> b -> c -> d
P4 travels from b -> c -> a -> d -> e -> f
P5 travels from e -> f -> a -> b -> c
P6 travels from d -> e -> a -> b -> c
P7 travels from a -> c -> e -> f
Those are the paths that I want from the graph, where p1, p2, ..., pn are values of the edge property called name. I have already come up with a query, but I don't know how to optimize it. It also can't handle people who start at a vertex and end at the same vertex. I have a time on every session (but maybe Gremlin can't order the traversal by time within the same session?). Here is the current query:
g.withSack([])
  .V() // will eventually have some starting condition of about 10 unique people
  .repeat(
    choose(
      loops().is(0),
      outE().as('outgoing')
        .where(
          __.outV()
            .inE().values('name')
            .where(eq('outgoing'))
              .by()
              .by(values('name'))
            .count().is(0)
        )
        .sack(assign)
          .by(
            union(
              sack().unfold(),
              identity().values('name')
            )
            .fold()
          )
        .filter(
          sack().unfold().dedup().count().is(1)
        )
        .inV(),
      outE()
        .sack(assign)
          .by(
            union(
              sack().unfold(),
              identity().values('name')
            )
            .fold()
          )
        .filter(
          sack().unfold().dedup().count().is(1)
        )
        .inV()
    )
  )
  .until(
    outE().filter(sack().unfold().dedup().count().is(1)).count().is(1)
  )
  .filter(path().unfold().count().is(gt(5)))
  .path()
The query currently has several limitations. It returns every path starting from a given vertex, but it frequently runs into:
{
"detailedMessage": "A timeout occurred within the script during evaluation.",
"requestId": "f34358bd-9db9-488f-be66-613a34d29f9b",
"code": "TimeLimitExceededException"
}
Or a memory exception. Is there some way to optimize this query? I can't exactly replicate this in Gremlify, since I have about 50,000 unique sessions and each of them travels through anywhere from 2 to 50 vertices.
I will eventually perform traffic analysis on it, but I still can't get the query to finish within the default Neptune timeout, even with a limit of about 1000. I would like it to run within 10 seconds if possible, or 30 seconds at the upper limit.
Here's a relatively simple way to replicate this graph:
g.addV('place').as('1').
property(single, 'placename', 'a').
addV('place').as('2').
property(single, 'placename', 'b').
addV('place').as('3').
property(single, 'placename', 'c').
addV('place').as('4').
property(single, 'placename', 'd').
addV('place').as('5').
property(single, 'placename', 'e').
addV('place').as('6').
property(single, 'placename', 'f').
addV('place').as('7').
property(single, 'placename', 'g').
addV('place').as('8').
property(single, 'placename', 'h').
addV('place').as('9').
property(single, 'placename', 'i').
addE('person').from('1').to('2').
property('name', 'p1').addE('person').
from('2').to('3').property('name', 'p1').
addE('person').from('3').to('4').
property('name', 'p1').addE('person').
from('4').to('5').property('name', 'p1').
addE('person').from('2').to('3').
property('name', 'p2').addE('person').
from('3').to('4').property('name', 'p2').
addE('person').from('4').to('5').
property('name', 'p2').addE('person').
from('6').to('7').property('name', 'p3').
property('time', '2022-05-04 12:00:00').
addE('person').from('7').to('8').
property('name', 'p3').
property('time', '2022-05-05 12:00:00').
addE('person').from('8').to('9').
property('name', 'p3').
property('time', '2022-05-10 12:00:00').
addE('person').from('9').to('6').
property('name', 'p3').
property('time', '2022-05-03 12:00:00').
addE('person').from('5').to('6').
property('name', 'p4').addE('person').
from('6').to('7').property('name', 'p4').
addE('person').from('7').to('8').
property('name', 'p4').addE('person').
from('8').to('9').property('name', 'p4').
addE('person').from('3').to('4').
property('name', 'p5').addE('person').
from('4').to('4').property('name', 'p5').
addE('person').from('4').to('5').
property('name', 'p5').addE('person').
from('5').to('6').property('name', 'p5').
addE('person').from('6').to('7').
property('name', 'p5').addE('person').
from('1').to('2').property('name', 'p6').
addE('person').from('2').to('3').
property('name', 'p6').addE('person').
from('3').to('4').property('name', 'p6').
addE('person').from('4').to('5').
property('name', 'p6')
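To make the expected result concrete, here is a minimal sketch in plain Python (not Gremlin) of the output I'm after for two of the travellers in the sample graph. The edge tuples mirror the p3 and p5 edges above, with the vertex aliases 1..9 mapped to the placenames a..i. The function name `paths_by_person` and the rule "sort by time when every hop has one, otherwise keep insertion order" are my own assumptions for illustration, not something the graph enforces.

```python
from collections import defaultdict

# Edges for two travellers from the sample graph, as (name, from, to, time).
# p3 has timestamps and returns to its starting place; p5 has a self-loop d -> d.
EDGES = [
    ("p3", "f", "g", "2022-05-04 12:00:00"),
    ("p3", "g", "h", "2022-05-05 12:00:00"),
    ("p3", "h", "i", "2022-05-10 12:00:00"),
    ("p3", "i", "f", "2022-05-03 12:00:00"),
    ("p5", "c", "d", None),
    ("p5", "d", "d", None),
    ("p5", "d", "e", None),
    ("p5", "e", "f", None),
    ("p5", "f", "g", None),
]


def paths_by_person(edges):
    """Group edges by person and chain them into one ordered path each."""
    grouped = defaultdict(list)
    for name, src, dst, time in edges:
        grouped[name].append((time, src, dst))

    result = {}
    for name, hops in grouped.items():
        # Assumption: order by timestamp only when every hop has one;
        # otherwise the insertion order is kept as-is.
        if all(t is not None for t, _, _ in hops):
            hops = sorted(hops, key=lambda h: h[0])
        # The path is the first hop's source followed by every hop's target.
        result[name] = [hops[0][1]] + [dst for _, _, dst in hops]
    return result


for name, path in paths_by_person(EDGES).items():
    print(name, " -> ".join(path))
```

With the time ordering, p3 comes out as a path that starts and ends at the same place (i), which is exactly the case my Gremlin query can't handle.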
Also, I am starting with about 5000 vertices in the graph, since I have set the condition that at least 5 people must start from a place for it to be considered a valid starting point.
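The starting-point condition can be sketched the same way in plain Python. This is only an illustration of the rule, assuming the per-person paths are already known; the paths below are the p1, p2, p4 and p6 routes from the sample graph, `valid_starts` is a hypothetical helper name, and the threshold is lowered to 2 because the sample is so small.

```python
from collections import Counter

# Per-person paths from the sample graph (placenames a..i).
PATHS = {
    "p1": ["a", "b", "c", "d", "e"],
    "p2": ["b", "c", "d", "e"],
    "p4": ["e", "f", "g", "h", "i"],
    "p6": ["a", "b", "c", "d", "e"],
}


def valid_starts(paths, min_people):
    """Return the places where at least `min_people` distinct people start."""
    # Each person contributes exactly one start: the first place of their path.
    starts = Counter(path[0] for path in paths.values() if path)
    return sorted(place for place, n in starts.items() if n >= min_people)


print(valid_starts(PATHS, min_people=2))
```

In the real graph the threshold would be 5, which is what leaves me with about 5000 starting vertices.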