
We have a graph in Azure Cosmos DB (Gremlin API) with approx 3K vertices and 16K edges. I would like to drop all edges but keep the vertices.

When I run a Gremlin query like g.E().drop(), I get this exception:

ExceptionType: RequestRateTooLargeException
ExceptionMessage: Message: {"Errors":["Request rate is large"]}

The current provisioned throughput limit is 3,000 RU/s.

I understand the mechanism behind this error. "Wait and retry" is not an option here: the limit is exceeded by a single query, not by many queries, so if I run the same query again after a wait period, I will get the same exception.

The question is: what options do I have to drop all the edges with as few queries as possible?

I tried running g.E().limit(20).drop(); it works and reports 237.63 RUs.

When I run g.E().limit(2000).drop(), I get the exception.
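
For reference, the batched approach I am considering looks like this. It is only a minimal sketch using the gremlinpython driver; the endpoint, database/graph names, key, and batch size are placeholders, not values from my actual setup:

from gremlin_python.driver import client, serializer

# All connection details below are placeholders.
c = client.Client(
    "wss://your-account.gremlin.cosmos.azure.com:443/",
    "g",
    username="/dbs/your-db/colls/your-graph",
    password="your-key",
    message_serializer=serializer.GraphSONSerializersV2d0(),
)

BATCH = 200  # tune so one batch stays within the provisioned RU/s

# Drop edges in small batches until none remain.
while c.submit("g.E().count()").all().result()[0] > 0:
    c.submit(f"g.E().limit({BATCH}).drop()").all().result()

c.close()

Given that limit(20) costs ~237 RUs, a batch size in the low hundreds should stay under 3,000 RU/s, but presumably the right size depends on the data.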

Running g.E().limit(1).drop() repeatedly shows a varying RU cost in the Azure Data Explorer:

Executed: g.E().limit(1).drop() (61.72 RUs)
Executed: g.E().limit(1).drop() (53.14 RUs)
Executed: g.E().limit(1).drop() (61.72 RUs)
Executed: g.E().limit(1).drop() (56 RUs)

But the Request Charge reported stays constant at 546.38.

What would be the optimal way to get rid of the edges, in terms of performance and/or cost?

Sebastian Widz

1 Answer


When you run a drop() query, Cosmos DB actually drops some of the edges before it throws the 429 "Request rate is large" error. So you can retry the same query

g.E().drop()

until you get an empty result (meaning the query succeeded and all remaining edges were dropped).

I tried dropping edges from the Azure Data Explorer in Cosmos DB, and here is the result (400 RU/s provisioned, 329 edges initially):

g.E().count() => [329]
g.E().drop() => "Request rate is large"
g.E().count() => [282]

Consecutive drop() attempts removed 47, 67, 38, 75, 75, and finally 27 edges. After 6 attempts I got [ ], meaning all edges were dropped. So retrying until success is a workable approach here.
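
If you want to automate that, a retry loop can catch the throttling error and re-submit the same query until it succeeds. Here is a minimal sketch with the gremlinpython driver; the connection details are placeholders, and checking the error text for "Request rate is large" is an assumption about how the 429 surfaces through GremlinServerError:

import time
from gremlin_python.driver import client, serializer
from gremlin_python.driver.protocol import GremlinServerError

# All connection details below are placeholders.
c = client.Client(
    "wss://your-account.gremlin.cosmos.azure.com:443/",
    "g",
    username="/dbs/your-db/colls/your-graph",
    password="your-key",
    message_serializer=serializer.GraphSONSerializersV2d0(),
)

while True:
    try:
        # Each attempt drops as many edges as the RU budget allows
        # before Cosmos throttles it; the partial progress is kept.
        c.submit("g.E().drop()").all().result()
        break  # no throttling: the drop completed and all edges are gone
    except GremlinServerError as e:
        if "Request rate is large" in str(e):
            time.sleep(1)  # wait briefly, then retry the same query
        else:
            raise

c.close()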

m.zygmunt