
I have a Cosmos DB Gremlin API account set up with 400 RU/s. If I have to run a query that needs 800 RUs, does that mean the query takes 2 seconds to execute? If I increase the throughput to 1,600 RU/s, does the query execute in half a second? I am not seeing any significant changes in query performance by playing around with the RUs.

Michael Scott

2 Answers


As I explained in a different, but somewhat related answer here, Request Units are allocated on a per-second basis. In the event a given query will cost more than the number of Request Units available in that one-second window:

  • The query will be executed
  • You will now be in "debt" by the overage in Request Units
  • You will be throttled until your "debt" is paid off

Let's say you had 400 RU/sec, and you executed a query that cost 800 RU. It would complete, but then you'd be in debt for around 2 seconds (the 800 RU charge works out to roughly two seconds' worth of your 400 RU/sec allowance). Once that debt is paid off, you wouldn't be throttled anymore.

The speed at which a query executes does not depend on the number of RU allocated. Whether you had 1,000 RU/second or 100,000 RU/second, a query would run in the same amount of time (aside from any throttle time preventing the query from running initially). So, aside from throttling, your 800 RU query would run consistently, regardless of RU count.
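To make the RU accounting concrete, here is a minimal sketch using the gremlinpython driver against a Cosmos DB Gremlin endpoint. The account, database, graph, and key values are placeholders, and the status-attribute keys are the Cosmos-specific ones documented for Gremlin responses rather than standard TinkerPop fields:

```python
# Minimal sketch: submit a Gremlin query to a Cosmos DB Gremlin endpoint and
# read the RU charge back from the response's status attributes.
# <account>, <database>, <graph>, <primary-key> are placeholders.
from gremlin_python.driver import client, serializer

gremlin_client = client.Client(
    "wss://<account>.gremlin.cosmos.azure.com:443/",
    "g",
    username="/dbs/<database>/colls/<graph>",
    password="<primary-key>",
    message_serializer=serializer.GraphSONSerializersV2d0(),  # Cosmos expects GraphSON v2
)

result_set = gremlin_client.submit("g.V().count()")
vertex_count = result_set.all().result()  # block until the query finishes

# Cosmos-specific status attributes (key names per the Cosmos Gremlin docs;
# they are populated once the result stream has been consumed)
attrs = result_set.status_attributes
print("Request charge (RU):", attrs.get("x-ms-total-request-charge"))
print("Status code:", attrs.get("x-ms-status-code"))

gremlin_client.close()
```

Checking the actual charge this way gives you the number to compare against your provisioned RU/s, rather than guessing from query latency.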

David Makogon
  • Makes sense, thank you. So if I have batch jobs to run (which need more RUs), would it be a good idea to run those during off-peak hours, to make sure customers are not throttled during regular business hours? In other words, if I am OK with some downtime in off-peak hours, can I keep my throughput at the minimum and run the expensive jobs off-peak? – Michael Scott Jul 21 '20 at 20:38
  • @MichaelScott - honestly, the way you distribute traffic is up to you. However, if I were in your position, I'd likely increase my RU capacity during peak hours and decrease it during non-peak hours. You have complete flexibility over RU allocation - you can adjust it at any time. Just consider the cost of an extra few hundred RU - it's fairly negligible, even more so if you only raise RU for a subset of each day. – David Makogon Jul 21 '20 at 20:57
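Following up on the peak/off-peak idea in the comments above: provisioned throughput can be changed at any time, so a scheduled job could raise RU/s before business hours and lower it afterwards. A rough sketch, assuming the azure-mgmt-cosmosdb management SDK and placeholder resource names (not the only way to do this; the portal or CLI work too):

```python
# Rough sketch: scale a Gremlin graph's provisioned throughput.
# Assumes azure-mgmt-cosmosdb and azure-identity are installed;
# all resource names below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.cosmosdb import CosmosDBManagementClient
from azure.mgmt.cosmosdb.models import (
    ThroughputSettingsResource,
    ThroughputSettingsUpdateParameters,
)

mgmt = CosmosDBManagementClient(DefaultAzureCredential(), "<subscription-id>")

poller = mgmt.gremlin_resources.begin_update_gremlin_graph_throughput(
    resource_group_name="<resource-group>",
    account_name="<cosmos-account>",
    database_name="<database>",
    graph_name="<graph>",
    update_throughput_parameters=ThroughputSettingsUpdateParameters(
        resource=ThroughputSettingsResource(throughput=1600)  # peak-hours RU/s
    ),
)
poller.result()  # wait for the scale operation to complete
```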

A single query is charged a given amount of Request Units, so it's not quite accurate to say a "query needs 800 RU/s" - the charge is 800 RU, not 800 RU per second. A 1 KB document read costs 1 RU, and writes are more expensive, starting at around 10 RU each. Generally you should avoid any request that individually costs more than, say, 50 RU, and even that is probably high. In my experience, I try to keep the charge for each operation as low as possible, usually under 20-30 RU even for large list queries.

The upshot is that 400 RU/s is more than enough to at least complete one query. It's when multiple requests combine to exceed the allowance within the time window that Cosmos tells you to wait some time before requests are allowed to succeed again. This is dynamic and based on a more or less black-box formula; it's not necessarily a simple division of allowance by charge, and no individual request runs faster or slower based on the limit.

You can see whether you're getting throttled by inspecting the response, or monitor it by checking the metrics in the Azure portal dashboard.
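For inspecting the response programmatically, here is a hedged sketch with gremlinpython: a throttled request surfaces as a server error, and recent driver versions expose the Cosmos status attributes on the exception, where an "x-ms-status-code" of 429 indicates throttling. The helper name and the back-off policy below are illustrative, not prescribed by the SDK:

```python
# Sketch: retry a Gremlin query when Cosmos throttles it (HTTP 429).
# Assumes a recent gremlinpython where GremlinServerError carries the
# Cosmos status attributes; attribute key names follow the Cosmos Gremlin docs.
import time
from gremlin_python.driver.protocol import GremlinServerError


def submit_with_retry(gremlin_client, query, max_attempts=5):
    for attempt in range(max_attempts):
        try:
            result_set = gremlin_client.submit(query)
            return result_set.all().result()
        except GremlinServerError as e:
            attrs = getattr(e, "status_attributes", {}) or {}
            if attrs.get("x-ms-status-code") != 429:
                raise  # not a throttle; surface the error
            # Cosmos also returns a suggested wait ("x-ms-retry-after-ms");
            # its format varies by driver version, so this sketch just
            # backs off exponentially instead of parsing it.
            time.sleep(0.5 * (2 ** attempt))
    raise RuntimeError(f"query still throttled after {max_attempts} attempts")
```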

Noah Stahl
  • There are many valid query scenarios that will far exceed 50 RU. It's not possible to keep queries under such a low threshold, especially with complex graph queries (e.g. traversing in an inward relationship direction costs more than an outward search, due to the way relationships are organized). Also, 400 RU is not necessarily enough to execute a single query (and aside from single queries, that doesn't take into account a heavy load executing multiple concurrent queries). – David Makogon Jul 21 '20 at 01:50
  • Also: document "reads" are not the same as document "queries." The example you gave, reading a 1KB document costing 1RU, is a direct point-read, executed via the SDK. That does not apply to the query engine. Also, writes can be performed under 10 RU - it's dependent on the number of indexed properties, size of document, etc. – David Makogon Jul 21 '20 at 01:58
  • Fair points, perhaps anything is possible. I should have clarified that I'm coming from my own context of fairly straightforward CRUD apps in the form of REST APIs, and it's definitely possible to stay far below 400 RU/s if you're careful about design. For newcomers to Cosmos, I think it's worth providing some context about when they might be "doing it wrong" -- which seems likely if your queries become that expensive right away. – Noah Stahl Jul 21 '20 at 01:59
  • The issue is that the OP is asking specifically about the relationship of RU to query performance. I would suggest that opinion-based guidance around RU scale and query optimizations is out-of-scope here (RU planning is far more complex than picking an arbitrary target). There's really nothing "wrong" that the OP is doing (as there is nothing inherently wrong with queries costing 800 RU, especially as we have zero context around their app or specific query needs). – David Makogon Jul 21 '20 at 02:03
  • Thanks for the clarification @David Makogon. My requirement is to get a full lineage diagram given a vertex, so I do need to use `repeat` in order to achieve it; that's why it costs a lot of RUs. That's almost the sole purpose of why I am modeling this solution in a graph. – Michael Scott Jul 21 '20 at 20:41