
I am using Neptune's graph database with Gremlin queries through Python to store addresses. Most of the queries execute fine, but when I run the following query, Neptune returns an internal failure exception:

g.V(address).outE('isPartOf').inV().
  dedup().as_('groupNode').
  inE('isPartOf').outV().dedup().as_('children').
  addE('isPartOf').to(group).
  select('groupNode').drop().
  fold().
  coalesce(__.unfold(), 
           g.V(address).addE('isPartOf').to(group)).next()

Every address can belong to a group. When an address is already assigned to a group, I try to take all addresses assigned to that group and assign them to a new group, while deleting the old group. If the address is not yet assigned to a group, I simply want to assign it to the new group immediately.

If I run this query on its own, everything executes perfectly (although it is a bit slow). However, once I execute it in parallel for more addresses, it fails with the following error:

Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 804, in __bootstrap_inner
    self.run()
  File "gremlinExample.py", line 30, in run
    processTx(self.tx, self.g, self.parentBlock)
  File "gremlinExample.py", line 152, in processTx
    g.V(address).outE('isPartOf').inV().dedup().as_('groupNode').inE('isPartOf').outV().dedup().as_('children').select('children').addE('isPartOf').to(group).select('groupNode').drop().fold().coalesce(__.unfold(), g.V(address).addE('isPartOf').to(group)).next()
  File "/home/ec2-user/.local/lib/python2.7/site-packages/gremlin_python/process/traversal.py", line 70, in next
    return self.__next__()
  File "/home/ec2-user/.local/lib/python2.7/site-packages/gremlin_python/process/traversal.py", line 43, in __next__
    self.traversal_strategies.apply_strategies(self)
  File "/home/ec2-user/.local/lib/python2.7/site-packages/gremlin_python/process/traversal.py", line 346, in apply_strategies
    traversal_strategy.apply(traversal)
  File "/home/ec2-user/.local/lib/python2.7/site-packages/gremlin_python/driver/remote_connection.py", line 143, in apply
    remote_traversal = self.remote_connection.submit(traversal.bytecode)
  File "/home/ec2-user/.local/lib/python2.7/site-packages/gremlin_python/driver/driver_remote_connection.py", line 54, in submit
    results = result_set.all().result()
  File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 405, in result
    return self.__get_result()
  File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 357, in __get_result
    raise type(self._exception), self._exception, self._traceback
GremlinServerError: 500: {"requestId":"a42015b7-6b22-4bd1-9e7d-e3252e8f3ab6","code":"InternalFailureException","detailedMessage":"Can not get the attachable from the host vertex: v[64b32957-ef71-be47-c8d7-0109cfc4d9fd]-/->neptunegraph[org.apache.commons.configuration.PropertiesConfiguration@6db0f02]"}

To my knowledge, parallel execution shouldn't be the problem, since every query simply gets queued at the database (this is exactly why I tried to write a single query that performs the whole task at once).
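One way to see which addresses fail under parallel execution is to collect the exceptions per address instead of letting each thread print its own traceback. The following is only a sketch under stated assumptions: `run_for_addresses` is a hypothetical helper, and `process` stands in for whatever function issues the query for one address (it is not the `processTx` code from the traceback).

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_for_addresses(process, addresses, max_workers=4):
    """Call process(address) for each address in a thread pool.

    Returns a dict mapping each failed address to the exception it raised.
    `process` is a hypothetical stand-in for the per-address query function.
    """
    failures = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Remember which address each future belongs to.
        futures = {pool.submit(process, address): address for address in addresses}
        for future in as_completed(futures):
            error = future.exception()
            if error is not None:
                failures[futures[future]] = error
    return failures
```

Running the same addresses with `max_workers=1` and then with a larger pool would show whether the error only appears when queries overlap.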

Apologies for any bad English; it is not my native language.

stephen mallette
J v Vuuren
  • Apparently the query failed because I called next() instead of iterate(). Although the error message seems random, replacing next() with iterate() fixed the issue for now. – J v Vuuren Oct 08 '18 at 09:32
  • Two things: (1) the error is interesting - could you please include the code that is making the requests in parallel? (2) I don't think that your traversal is doing what you describe - please include a Gremlin script that creates some sample data like the example shown here: https://stackoverflow.com/questions/51388315/gremlin-choose-one-item-at-random – stephen mallette Oct 09 '18 at 10:26
  • @JvVuuren Do you have any updates on this one? Perhaps capture your work around as an answer for now while you get more details as requested by Stephen? – The-Big-K Dec 27 '18 at 21:41
  • Eventually we ended up using Neo4j, as the Neptune database kept returning unexplainable errors. It might have been related to our setup, but I cannot supply any more information, as we are no longer developing this project. – J v Vuuren Jan 21 '19 at 08:08
  • @JvVuuren Do you mind accepting the answer so that this question can be closed out? – The-Big-K Aug 29 '19 at 01:19

1 Answer


For anyone else who's looking for an update here: the OP was able to resolve the issue by replacing .next() with .iterate(). Some follow-ups were needed to understand the query and data better, but the OP has abandoned the project and moved to another solution.
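As a minimal sketch of that change, assuming `g` is an already-connected gremlin_python traversal source and `address` and `group` are the same values used in the question (the function name `assign_to_group` is hypothetical): `next()` asks for the first result and hands it back to the caller, while `iterate()` runs the traversal only for its side effects and hands back no result. The example covers only the simple "assign directly" branch of the question's traversal, not the full regrouping logic.

```python
def assign_to_group(g, address, group):
    """Add an isPartOf edge from `address` to `group`, for its side effect only.

    `g`, `address` and `group` are assumed to be the objects from the question.
    """
    # iterate() exhausts the traversal and returns the traversal itself,
    # so no edge is handed back to the caller.
    # With next() instead, the new edge would be returned as the result,
    # and StopIteration would be raised if the traversal produced nothing.
    g.V(address).addE('isPartOf').to(group).iterate()
```

Since the caller in the question never uses the returned edge, ending the traversal with `iterate()` loses nothing.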

The-Big-K