I'm trying to define indexing on a few properties in Cosmos DB, but I'm a little confused about automatic indexing. As per the Cosmos DB documentation:

By default, Azure Cosmos DB automatically indexes every property for all items in your container without having to define any schema or configure secondary indexes.

Also, refer to this:

In some situations, you may want to override this automatic behavior to better suit your requirements. You can customize a container's indexing policy by setting its indexing mode, and include or exclude property paths.

What I understand from the above points is that, unless we define a custom indexing policy, automatic indexing is set to true (which makes sense). If, however, we define our own include and exclude paths for indexing, it should be false.

It would probably mean that if I define the container properties as below, the Indexing Policy Automatic property should be set to false on Cosmos DB.

using Microsoft.Azure.Cosmos;  // Azure Cosmos SDK v3.3.1

 .

 .

var containerProperties = new ContainerProperties
{
    Id = "SOME_CONTAINER_NAME",
    PartitionKeyPath = "/MY_PARTITION_KEY",
};

containerProperties.IndexingPolicy.IncludedPaths.Add(new IncludedPath { Path = "/\"{MY_PARTITION_KEY}\"/?" });

containerProperties.IndexingPolicy.ExcludedPaths.Add(new ExcludedPath { Path = "/*" });

However, with the above configuration, I see that the indexing policy on Cosmos DB is defined with automatic set to true. (Screenshot: Indexing Policy on Azure Portal)

Are the Automatic property and the IncludedPaths and ExcludedPaths properties in the IndexingPolicy class unrelated? If so, what does the Automatic property mean when we have defined IncludedPaths and ExcludedPaths on the indexing policy?

Edit 1

It gets a little more tricky and confusing. Even after setting the Automatic property to false, the property remains true in the portal.

That is, the code below does not seem to have any effect.

containerProperties.IndexingPolicy.Automatic = false;

Edit 2

Even if I update the automatic property from the portal settings, the value does not change, and I do not receive any error.

Ankit Vijay

1 Answer

I am from the CosmosDB Engineering Team. The "automatic" property and the Included/Excluded paths are unrelated.

The "automatic" property is deprecated for most containers now. It could be used to split a collection horizontally into two sets of documents, one set that is secondary-indexed and another that is not, by overriding the indexing directive at a per-document level. Besides lacking concrete business value, setting the automatic property to false also caused inconsistencies in query results, depending on whether the query used the index (as opposed to a scan, for instance). So we have now deprecated the property (it cannot be set to false).

The "automatic indexing" that we generally refer to is the fact that all paths in all your documents in your container are indexed by default. This can be seen by the fact that the default indexing policy includes /* (everything under the 'root' path) in the IncludedPaths section. Hope this helps.
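For illustration, this is roughly what the default indexing policy looks like as JSON when viewed in the portal. Note that `automatic` and the path lists are separate fields, and `/*` appears under `includedPaths`:

```json
{
  "indexingMode": "consistent",
  "automatic": true,
  "includedPaths": [
    { "path": "/*" }
  ],
  "excludedPaths": [
    { "path": "/\"_etag\"/?" }
  ]
}
```

A policy that indexes only one property, as in the question, would be expected to look something like the sketch below. The property name `MY_PARTITION_KEY` is the placeholder from the question, and the exact path string is an assumption. `automatic` stays true here as well; only the path lists change:

```json
{
  "indexingMode": "consistent",
  "automatic": true,
  "includedPaths": [
    { "path": "/MY_PARTITION_KEY/?" }
  ],
  "excludedPaths": [
    { "path": "/*" }
  ]
}
```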

Krishnan Sundaram
  • Hi @Krishnan, thanks for your response. I understand that if we index all the properties (that is, the default automatic indexing), it will impact the RU cost of writes. Is there any document I can refer to in order to understand how large the impact is? I ask because, if the impact is small, we may just go with indexing all the fields. – Ankit Vijay Nov 06 '19 at 00:33
  • Also, I think it would be good to update the documentation to explain this better. – Ankit Vijay Nov 06 '19 at 00:36
  • This is a good tool that I used earlier to calculate RUs. The online calculator does not take indexing policies etc. into account. https://github.com/RicardoNiepel/AzureCosmosDB-RequestUnitsTester – Rahul Nov 06 '19 at 01:51
  • Hi Ankit, the best way to determine the charges would be to try out inserts of typical documents with indexing (IndexingMode = Consistent) and without indexing (IndexingMode = None) on a small collection. I hesitate to provide concrete numbers, since charges depend on the number of terms per document, which in turn is workload dependent. – Krishnan Sundaram Nov 10 '19 at 08:05
  • The Automatic property should be deprecated in the SDK, and hidden in the container creation UI... – Thomas Levesque Jun 20 '20 at 15:42