I'm using AWS Neptune with the Gremlin query language (latest version).
The sample data was inserted this way:
g.addV('test_report').property('name', 'REF').property('creationDateTime','2022-07-01 00:00:00.000000')
g.addV('test_reportrelease').property('name', 'A').property('creationDateTime','2022-07-01 01:00:00.000000')
g.addV('test_reportrelease').property('name', 'B').property('creationDateTime','2022-07-01 02:00:00.000000')
g.addV('test_reportrelease').property('name', 'C').property('creationDateTime','2022-07-01 03:00:00.000000')
g.addE('test_has').property('creationDateTime','2022-07-02 01:00:00.000000')
.from(V().hasLabel('test_report').has('name', 'REF'))
.to(V().hasLabel('test_reportrelease').has('name', 'A'))
g.addE('test_has').property('creationDateTime','2022-07-02 02:00:00.000000')
.from(V().hasLabel('test_report').has('name', 'REF'))
.to(V().hasLabel('test_reportrelease').has('name', 'B'))
g.addE('test_has').property('creationDateTime','2022-07-02 03:00:00.000000')
.from(V().hasLabel('test_report').has('name', 'REF'))
.to(V().hasLabel('test_reportrelease').has('name', 'C'))
What I want is:
- First, get the vertices with label "test_report".
- Then apply all of the following branches (union):
- Follow, if they exist, all outgoing edges (outE) with label "test_has" from "test_report" vertices to "test_reportrelease" vertices, then move to the incoming vertices (inV), and attach a constant with name "ref" and value "test_has" to every edge traversed.
- Follow the same "test_has" edges and incoming vertices, if they exist, but keep only the first result in ascending order of creationDateTime, and attach a constant with name "ref" and value "test_first" to every edge traversed.
- Follow the same "test_has" edges and incoming vertices, if they exist, but keep only the first result in descending order of creationDateTime, and attach a constant with name "ref" and value "test_last" to every edge traversed.
- Finally, use the tree() step to get all traversed vertices and edges as a tree.
The main problem is adding the "ref" constant to every edge followed (or changing the label at query time).
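To pin down the expected output independently of Gremlin, the steps above can be sketched in plain Python. This is only an illustration of the intended semantics, not a Neptune query: the dictionaries stand in for the sample vertices and edges, and the helper names `tag` and `expected_rows` are hypothetical. It assumes the ordering is done on the edge's creationDateTime, as in my query.

```python
from operator import itemgetter

# In-memory stand-ins for the sample 'test_reportrelease' vertices.
releases = {
    "A": {"label": "test_reportrelease", "name": "A",
          "creationDateTime": "2022-07-01 01:00:00.000000"},
    "B": {"label": "test_reportrelease", "name": "B",
          "creationDateTime": "2022-07-01 02:00:00.000000"},
    "C": {"label": "test_reportrelease", "name": "C",
          "creationDateTime": "2022-07-01 03:00:00.000000"},
}

# In-memory stand-ins for the sample 'test_has' edges from REF.
edges = [
    {"label": "test_has", "from": "REF", "to": "A",
     "creationDateTime": "2022-07-02 01:00:00.000000"},
    {"label": "test_has", "from": "REF", "to": "B",
     "creationDateTime": "2022-07-02 02:00:00.000000"},
    {"label": "test_has", "from": "REF", "to": "C",
     "creationDateTime": "2022-07-02 03:00:00.000000"},
]


def tag(edge, ref):
    """Copy the target vertex's properties and add a query-time 'ref' key.

    The stored data is never modified; 'ref' exists only in the result.
    """
    return {**releases[edge["to"]], "ref": ref}


def expected_rows(report_name):
    """Emulate the union of the three branches for one report."""
    out = sorted(
        (e for e in edges
         if e["from"] == report_name and e["label"] == "test_has"),
        key=itemgetter("creationDateTime"),
    )
    rows = [tag(e, "test_has") for e in out]       # branch 1: all edges
    if out:                                        # 'optional': skip if none
        rows.append(tag(out[0], "test_first"))     # branch 2: earliest edge
        rows.append(tag(out[-1], "test_last"))     # branch 3: latest edge
    return rows


rows = expected_rows("REF")
for row in rows:
    print(row["name"], row["ref"])
```

The point of the sketch is that the same vertex (for example A) appears once per branch with a different "ref" value, and that nothing is written back to the stored data.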
The sample query I wrote for most of my needs, missing the "ref" constant, is:
g.V().hasLabel('test_report')
  .union(
    optional(outE().hasLabel('test_has').order().by('creationDateTime').inV()),
    optional(outE().hasLabel('test_has').order().by('creationDateTime').limit(1).inV()),
    optional(outE().hasLabel('test_has').order().by(coalesce(values('creationDateTime'), constant('')), desc).limit(1).store('last').inV())
  ).valueMap(true).path()
Question: how can I attach a constant key:value pair to every edge or vertex traversed?
The result should look like this (but formatted as a tree; I use path() here because it is more readable when testing):
1 path[v[70c0dcd5-a6b9-4532-28bf-85705e94697e], e[66c0dcd7-31ed-e381-5331-e8c73bb91be1][70c0dcd5-a6b9-4532-28bf-85705e94697e-test_has->c0c0dcd5-c102-4031-d739-b5bd8fe161bc], v[c0c0dcd5-c102-4031-d739-b5bd8fe161bc], {<T.id: 1>: 'c0c0dcd5-c102-4031-d739-b5bd8fe161bc', <T.label: 4>: 'test_reportrelease', 'name': ['A'], 'creationDateTime': ['2022-07-01 01:00:00.000000'], 'ref': ['test_has']}]
2 path[v[70c0dcd5-a6b9-4532-28bf-85705e94697e], e[b0c0dcd7-4d99-1f3c-077d-decd2e251c46][70c0dcd5-a6b9-4532-28bf-85705e94697e-test_has->68c0dcd5-c5c5-c869-d789-96acbb88131f], v[68c0dcd5-c5c5-c869-d789-96acbb88131f], {<T.id: 1>: '68c0dcd5-c5c5-c869-d789-96acbb88131f', <T.label: 4>: 'test_reportrelease', 'name': ['B'], 'creationDateTime': ['2022-07-01 02:00:00.000000'], 'ref': ['test_has']}]
3 path[v[70c0dcd5-a6b9-4532-28bf-85705e94697e], e[70c0dcd7-72ac-204f-1341-cc843d165a38][70c0dcd5-a6b9-4532-28bf-85705e94697e-test_has->5ac0dcd5-cb10-f392-7d80-69541c4f22eb], v[5ac0dcd5-cb10-f392-7d80-69541c4f22eb], {<T.id: 1>: '5ac0dcd5-cb10-f392-7d80-69541c4f22eb', <T.label: 4>: 'test_reportrelease', 'name': ['C'], 'creationDateTime': ['2022-07-01 03:00:00.000000'], 'ref': ['test_has']}]
4 path[v[70c0dcd5-a6b9-4532-28bf-85705e94697e], e[66c0dcd7-31ed-e381-5331-e8c73bb91be1][70c0dcd5-a6b9-4532-28bf-85705e94697e-test_has->c0c0dcd5-c102-4031-d739-b5bd8fe161bc], v[c0c0dcd5-c102-4031-d739-b5bd8fe161bc], {<T.id: 1>: 'c0c0dcd5-c102-4031-d739-b5bd8fe161bc', <T.label: 4>: 'test_reportrelease', 'name': ['A'], 'creationDateTime': ['2022-07-01 01:00:00.000000'], 'ref': ['test_first']}]
5 path[v[70c0dcd5-a6b9-4532-28bf-85705e94697e], e[70c0dcd7-72ac-204f-1341-cc843d165a38][70c0dcd5-a6b9-4532-28bf-85705e94697e-test_has->5ac0dcd5-cb10-f392-7d80-69541c4f22eb], v[5ac0dcd5-cb10-f392-7d80-69541c4f22eb], {<T.id: 1>: '5ac0dcd5-cb10-f392-7d80-69541c4f22eb', <T.label: 4>: 'test_reportrelease', 'name': ['C'], 'creationDateTime': ['2022-07-01 03:00:00.000000'], 'ref': ['test_last']}]
I have tried the following two approaches, but both interrupt the graph traversal when using valueMap:
g.V().hasLabel('test_report').outE().hasLabel('test_has')
.order().by('creationDateTime').limit(1).valueMap(true).unfold().inject(['ref':'test_first']).fold()
And
g.V().hasLabel('test_report').outE().hasLabel('test_has')
.order().by('creationDateTime').limit(1).union(valueMap(true).unfold(), project('ref').by(constant('test_first'))).fold()
Is there a way to achieve this?
I don't want the "ref" values stored in the database; I only need them to appear in the query results.
PS: I'm querying my graph db from a Jupyter SageMaker Notebook.