g.addV('person').property('firstName','Bob').as('bob').
addV('decision').property('decision','REFER').as('brefer').select('bob').addE('hasDecision').to('brefer').
addV('phone').property('number','123').as('phone').select('bob').addE('hasPhone').to('phone').
addV('person').property('firstName','Jon').as('jon').
addV('decision').property('decision','ACCEPT').as('jaccept').select('jon').addE('hasDecision').to('jaccept').
addV('decision').property('decision','DECLINE').as('jdecline').select('jon').addE('hasDecision').to('jdecline').
addV('email').property('email','a@a.com').as('email').select('jon').addE('hasEmail').to('email').
select('jon').addE('hasPhone').to('phone').
addV('person').property('firstName','Phil').as('phil').
addV('decision').property('decision','DECLINE').as('pdecline').select('phil').addE('hasDecision').to('pdecline').
select('phil').addE('hasEmail').to('email')

In the above graph, Phil is linked to Jon by a shared email, and Jon is in turn linked to Bob by a shared phone. Each person node has decision nodes attached. I need a query that returns a path if Phil is linked, within 4 hops, to anyone who has a decision node attached with a decision of REFER. The traversal itself should not pass through decision nodes.

The expected answer is Phil -> email (hop 1) -> Jon (hop 2) -> phone (hop 3) -> Bob (hop 4), since Bob has a REFER decision node.

I am writing this in Gremlin on AWS Neptune. The query below should return Bob:

g.V().has('firstName','Phil').repeat(bothE().not(has(label,'hasDecision')).bothV().simplePath())
.until(out().has('decision','REFER')).path().by(valueMap()).by(label())

==>path[{firstName=[Phil]}, hasEmail, {email=[a@a.com]}, hasEmail, {firstName=[Jon]}]

The traversal has found Bob (this can be proved by replacing REFER with X, which returns nothing), but the path() step just gives up at Jon. This seems to be a repeat() step issue, which can be shown by simplifying the query and replacing until() with times():

g.V().has('firstName','Phil').repeat(bothE().not(has(label,'hasDecision')).bothV().simplePath())
.times(2).path().by(valueMap()).by(label())
==>path[{firstName=[Phil]}, hasEmail, {email=[a@a.com]}, hasEmail, {firstName=[Jon]}]

g.V().has('firstName','Phil').repeat(bothE().not(has(label,'hasDecision')).bothV().simplePath())
.times(4).path().by(valueMap()).by(label())
==>path[{firstName=[Phil]}, hasEmail, {email=[a@a.com]}, hasEmail, {firstName=[Jon]}]

Note that the last query must have traversed to Bob at the end of the chain but path() has given up at Jon.

Recreating the query without the repeat() gives the right path, but this is no good to me because the target node is an unknown distance away:

g.V().has('firstName','Phil').
bothE().not(has(label,'hasDecision')).bothV().
bothE().not(has(label,'hasDecision')).bothV().
bothE().not(has(label,'hasDecision')).bothV().
bothE().not(has(label,'hasDecision')).bothV().
simplePath().path().by(valueMap()).by(label())
==>path[{firstName=[Phil]}, hasEmail, {email=[a@a.com]}, hasEmail, {firstName=[Jon]}, hasPhone, {number=[123]}, hasPhone, {firstName=[Bob]}]

Has anyone seen this and found a workaround? Is there an alternative to path()? The query works fine on TinkerGraph, by the way (after replacing has(label, ...) with hasLabel(...)).

Phil Exell
Not sure if this is relevant for you, but I've been using a repeat.until query on Neptune with a variable distance between nodes. It works fine for me.

g.V().has('id', args.id).store('x').
  repeat(inE().outV().where(without('x')).aggregate('x')).
  until(has('id', args.personId)).
  limit(1).
  path().by(valueMap(true)).
  next()

– codetiger Aug 20 '20 at 13:54

1 Answer


The solution here turns out to be straightforward. The traversal has to negotiate both incoming and outgoing edges. The bothE() and bothV() steps enable this, but they cause the traversal to go back on itself. Adding a dedup() inside the repeat(), i.e. (bothE()... bothV().dedup().simplePath()), makes sure this doesn't happen.

g.V().has('firstName','Phil').repeat(bothE().not(has(label,'hasDecision')).bothV().dedup().simplePath()).until(out().has('decision','REFER')).path().by(valueMap()).by(label())
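
If the search also needs to respect the 4-hop limit described in the question, one option is to add a loops() check to the until() condition. The following is only a sketch: it assumes each repeat() iteration counts as one hop, it reuses the query above otherwise unchanged, and the trailing filter() is an addition that drops traversers which hit the limit without reaching a REFER decision. It is untested on Neptune.

```groovy
// Sketch: same traversal as above, capped at 4 repeat() iterations
// (an assumed reading of "within 4 hops").
// until(): stop when a REFER decision is attached, or after the 4th iteration.
// filter(): discard traversers that stopped only because of the hop limit.
g.V().has('firstName','Phil').
  repeat(bothE().not(has(label,'hasDecision')).bothV().dedup().simplePath()).
  until(or(out().has('decision','REFER'), loops().is(4))).
  filter(out().has('decision','REFER')).
  path().by(valueMap()).by(label())
```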

Thanks to @codetiger for drawing my attention to the edge direction.

Phil Exell
If you don't want `bothE().bothV()` to traverse back on itself, you would do better to use `bothE().otherV()`. That way you only traverse to the vertices you didn't originate from. – stephen mallette Sep 01 '20 at 22:00
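
Applied to the query in the answer, that suggestion might look like the sketch below. It assumes otherV() can simply take the place of bothV().dedup(), with simplePath() kept as a guard against longer cycles; this has not been verified on Neptune.

```groovy
// Sketch of the otherV() suggestion: move only to the vertex at the far end
// of each edge, so the traverser never steps back onto the vertex it came from.
g.V().has('firstName','Phil').
  repeat(bothE().not(has(label,'hasDecision')).otherV().simplePath()).
  until(out().has('decision','REFER')).
  path().by(valueMap()).by(label())
```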