How to speed up a Cosmos DB aggregate query?

Question

Our Cosmos DB aggregate query seems slow and costs a lot of RUs. Here are the details (see also the screenshot below): 2.4 s and 3222 RUs to count a result set of 414k records. And that is for just one count. Normally we would want to sum many fields at once (possible only within a single partition), but performance for that is much worse.

There are 2 million records in this collection. We are using Cosmos DB with the SQL API. This particular collection is partitioned by country_code; there are 414,732 records in France ("FR") and the remainder are in the US. Document size averages 917 bytes, with a minimum of maybe 800 bytes and a maximum of 1300 bytes.

Note that we have also tried a much sparser partition key such as device_id (of which there are 2 million, one document per device here), which gives worse results for this query. The c.calculated.flag1 field just represents a "state" that we want to keep a count of (we actually have 8 states that I'd like to summarize on).

The indexing on this collection is the default, which uses "consistent" index mode, and indexes all fields (and includes range indexes for Number and String). RU setting is at 20,000, and there is no other activity on the DB.

So let me know your thoughts on this. Can Cosmos DB reasonably be used to get a few sums or counts on fields without ramping up our RU charges and taking a long time? While 2.4 s is not awful, we really need sub-second queries for this kind of thing. Our application (IoT based) often needs individual documents, but sometimes also needs these kinds of counts across all documents in a country.

Is there a way to improve performance?

A single index on "all fields" wouldn't be useful unless you're filtering in the order that the fields are listed in the index. Assuming your two fields here were both actually part of the record, that would mean you would need an index on `(country_code, calculated.flag1)`. I assume from the name that `calculated.flag1` is a computed value and not stored in the table. How is this being computed? Writing out the computation in your query may enable it to utilize indexes. — stevendesu, May 07 '19 at 20:16

score 3 · Accepted Answer · answered May 12 '19 at 15:01

The Cosmos DB team has now made some significant changes to aggregation performance and to how indexes are used. This is their indexing "v2" strategy and was only recently rolled out (it may not be available to all accounts yet; contact MSFT if you have an older db that needs upgrading).

You can compare the new results to the picture I originally posted.

You'll note now that Document load time shows as 0ms and the retrieved document size is 0 bytes. The load time I can confirm is really quite fast now so it is possible it is under 1ms when measured from the server side. And document size of 0 makes more sense since no documents need to be retrieved for this (only count based on the index).

Finally you can see that the RUs dropped from 3222 to 7.4 !!!! A pretty drastic difference.
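To put those two charges in perspective, here is a small back-of-the-envelope calculation. It only uses figures quoted in this thread (20,000 RU/s provisioned, 3222 RUs before, 7.4 RUs after) and assumes, purely for illustration, that this COUNT is the only workload on the container. The helper name `max_queries_per_second` is made up for this sketch.

```python
# Figures quoted in this thread; nothing here talks to Cosmos DB.
PROVISIONED_RU_PER_SEC = 20_000  # container throughput from the question
RU_BEFORE = 3222.0               # COUNT charge before the indexing change
RU_AFTER = 7.4                   # COUNT charge after the indexing change


def max_queries_per_second(ru_per_query: float,
                           provisioned: float = PROVISIONED_RU_PER_SEC) -> float:
    """Upper bound on queries/sec, assuming this query is the only workload."""
    return provisioned / ru_per_query


reduction_factor = RU_BEFORE / RU_AFTER

print(f"RU reduction: about {reduction_factor:.0f}x")
print(f"before: about {max_queries_per_second(RU_BEFORE):.1f} queries/s")
print(f"after:  about {max_queries_per_second(RU_AFTER):.0f} queries/s")
```

Under that assumption the charge drops by a factor of roughly 435, which moves the ceiling from a handful of such counts per second to a few thousand.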

Summing on multiple columns at once within a single partition is also quite performant now and we can do about 8 sums at once across 2 million documents with ~50 RUs and it takes about 20-70ms when measured from a function API endpoint (so includes network time).
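As a rough illustration of what "about 8 sums at once" within one partition might look like, the sketch below builds a single parameterized query that sums several fields in one pass. It is not the exact query used above: the helper `build_multi_sum_query`, the `sumN` aliases, and the field names `calculated.flag2` through `calculated.flag8` are assumptions for the example (only `calculated.flag1` appears in the question). The code only builds the query text and does not contact Cosmos DB.

```python
import re
from typing import Dict, List, Tuple

# Allow only simple dotted property paths so field names can be
# interpolated into the query text safely.
_SAFE_PATH = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*$")


def build_multi_sum_query(fields: List[str],
                          country_code: str) -> Tuple[str, List[Dict[str, str]]]:
    """Return (query_text, parameters) that sums every field in one query."""
    if not fields:
        raise ValueError("at least one field is required")
    projections = []
    for i, field in enumerate(fields):
        if not _SAFE_PATH.match(field):
            raise ValueError(f"unsafe field path: {field!r}")
        projections.append(f"SUM(c.{field}) AS sum{i}")
    query = ("SELECT " + ", ".join(projections)
             + " FROM c WHERE c.country_code = @country")
    parameters = [{"name": "@country", "value": country_code}]
    return query, parameters


# flag2..flag8 are hypothetical names standing in for the other states.
fields = [f"calculated.flag{n}" for n in range(1, 9)]
query, parameters = build_multi_sum_query(fields, "FR")
print(query)

# With the azure-cosmos Python SDK this could then be run against one
# partition, e.g. (not executed here):
#   container.query_items(query=query, parameters=parameters, partition_key="FR")
```

Passing the partition key value keeps the query inside a single partition, which is the case described as working well here.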

More work still needs to be done by Cosmos DB team to allow for cross partition multi-column aggregations, but the improvements we have now are quite promising.

Beyond the improved performance, how can you tell if you are on a "v2" indexing strategy? — Stephen McDowell, Aug 16 '19 at 18:45

score 0 · Answer 2 · answered May 10 '19 at 12:26

For the specific query shown, there is no need to specify the table name, and you could try adding LIMIT 1, which may improve performance somewhat. For example:

SELECT COUNT(1) FROM c WHERE country_code="FR" AND calculated.flag=1 LIMIT 1

Also, do not forget to analyse your query execution carefully; I am not sure what Cosmos offers, but think of something like PostgreSQL's EXPLAIN ANALYSE. Also make sure you are using the most suitable types, for example varchar(2) instead of varchar(3). I would recommend replacing the country character codes with numbers if you are filtering on them (as you point out), for example FR=1, GR=2 and so on. This will also improve performance. Finally, if country code and calculated flag are related, create a single field that combines them. If none of this works, check client performance, and even hardware.

Shahar Hadas · Answer 3 · 2019-05-11T12:11:56.977

Two ideas:

Try running the following, see if you get different run times:

SELECT COUNT(1) FROM c WHERE c.country_code = "FR"

Important! If the calculated.flag1 field is not persisted, it could be the cause of the issue: the DB engine would have to calculate the result for each document/record, hence the high RU charge. Can you optimize the calculated fields (break them down, or do the calculation as part of the query)?

The second suggestion is to make sure you have defined a composite index:

{
    "automatic": true,
    "indexingMode": "Consistent",
    "includedPaths": [
        {
            "path": "/*"
        }
    ],
    "excludedPaths": [],
    "compositeIndexes": [
        [
            {
                "path": "/country_code",
                "order": "ascending"
            },
            {
                "path": "/calculated/flag1",
                "order": "descending"
            }
        ]
    ]
}

Please also see Composite indexing policy examples

And Manage indexing policies in Azure Cosmos DB to see where you edit it
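If you prefer to generate such a policy in code rather than edit the JSON by hand, a minimal sketch could look like the following. The helper `composite_index_policy` is hypothetical, and the path `/calculated/flag1` assumes the flag is stored as a scalar at `calculated.flag1`, since composite index paths need to point at scalar values. The code only builds and prints the policy document; it does not apply it to a container.

```python
import json
from typing import Any, Dict, List, Tuple


def composite_index_policy(paths: List[Tuple[str, str]]) -> Dict[str, Any]:
    """Build an indexing policy with a single composite index over `paths`.

    Each entry is (path, order), with order "ascending" or "descending".
    """
    for path, order in paths:
        if not path.startswith("/"):
            raise ValueError(f"path must start with '/': {path!r}")
        if order not in ("ascending", "descending"):
            raise ValueError(f"unknown order: {order!r}")
    return {
        "indexingMode": "consistent",
        "automatic": True,
        "includedPaths": [{"path": "/*"}],
        "excludedPaths": [],
        "compositeIndexes": [
            [{"path": path, "order": order} for path, order in paths]
        ],
    }


# Assumed paths: the partition key plus the scalar flag from the question.
policy = composite_index_policy([
    ("/country_code", "ascending"),
    ("/calculated/flag1", "descending"),
])
print(json.dumps(policy, indent=4))
```

The printed JSON can be pasted into the indexing policy editor described in the pages linked above.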

Thanks. I'll definitely check this out. However, since Cosmos has now drastically improved performance (see my posted answer), I will assume that the issue was really on their side to begin with. Your answer may indeed help, so I'll let you know. I do know that composite indexes are required for multi-column "order by", a feature they also just rolled out. — Fraggle, May 12 '19 at 15:02
Can you let us know some more information on this v2 indexing strategy? How do we enable this? — Anupam Chand, Mar 23 '21 at 14:16
