Strange execution behavior of Cosmos Gremlin query

Question

I have a below simple query which creates a new vertex and adds an edge between old vertex and new vertex in the same query. This query works well most of the times. The strange behavior kicks in when there is heavy load on the system and RUs are exhausted.

g.V('2f9d5fe8-6270-4928-8164-2580ad61e57a').AddE('likes').to(g.AddV('fruit').property('id','1').property('name','apple'))

Under Low/Normal Load the above query creates fruit vertex 1 and creates likes edge between user and fruit. Expected behavior.

Under Heavy load(available RUs are limited) the above query creates fruit vertex but doesn't create likes edge between user and fruit. Query throws 429 status code. If i try to replay the query then i get 409 since fruit vertex already exists. This behavior is corrupting the data.

In many places i have g.AddV inside the query. So all those queries might break under heavy load.

Does it make any difference if i use __.addV instead of g.AddV?

UPDATED: using __.addV doesn't make any difference.

So, is my query wrong? do i need to do upsert wherever i need to add an edge?

score 1 · Answer 1 · answered Dec 31 '19 at 02:49

I don't know how Microsoft implemented TinkerPop and thus I'm not sure if the following will help, but you could try to create the new vertex first and then add an edge to/from the existing vertex.

g.addV('fruit').
    property('id','1').
    property('name','apple').
  addE('likes').
    from(V('2f9d5fe8-6270-4928-8164-2580ad61e57a'))

If that also fails, then yes, an upsert is probably your best bet, as you can retry the same query indefinitely. However, since I have no deep knowledge of CosmosDB, I can't tell if its upserts can prevent edge duplication.

score 0 · Answer 2 · answered Jan 21 '20 at 20:58

In Cosmos DB Gremlin API, the transactional scope is limited to write operations on an entity (a Vertex or Edge). So for Gremlin requests that need to perform multiple write operations, it is possible that on failure a partial state will be committed.

Given this, it is recommended that you use idempotent gremlin traversals, such that the request can be retried on errors like RequestRateTooLarge (429) without becoming blocked by conflict errors on retry.

Here is the traversal re-written using coalesce() step so that it is idempotent (I assumed that 'name' is the partition key).

g.V('1').has('name', 'apple').fold()
  coalesce(
    __.unfold(),
    __.addV('fruit').
       property('id','1').
       property('name','apple')).
 addE('likes').
    from(V('2f9d5fe8-6270-4928-8164-2580ad61e57a'))

Note: I did not wrap the addE() in a coalesce() as it is the last operation to be perform during execution. You may want to consider doing this if there will be additional write ops after the edge in the same request, or if you need to prevent duplicate edges for concurrent add edge requests.

Strange execution behavior of Cosmos Gremlin query

2 Answers2