CosmosDB: ReadItemAsync takes more RU than using queryable for items with large nested collection

Question

I ran into unexpected behavior when loading items with ReadItemAsync and GetItemLinqQueryable. ReadItemAsync is supposed to be cheaper and faster because it performs a "point read", as described here (https://devblogs.microsoft.com/cosmosdb/point-reads-versus-queries). In my tests it is always faster, but once the item grows past a certain size it starts to consume noticeably more RUs than the equivalent query.

Here's an example:

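using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;
using Microsoft.Azure.Cosmos.Fluent;
using Microsoft.Azure.Cosmos.Linq;
using Newtonsoft.Json;

public class Program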
{
    public static async Task Main()
    {
        var client = await CreateClientAsync();
        var container = client.GetContainer("TestDb", "TestContainer");

        var item1Inner = await CreateAsync(container, 1);
        var item10Inner = await CreateAsync(container, 10);
        var item100Inner = await CreateAsync(container, 100);
        var item1000Inner = await CreateAsync(container, 1000);

        await TestAsync(container, item1Inner);
        await TestAsync(container, item10Inner);
        await TestAsync(container, item100Inner);
        await TestAsync(container, item1000Inner);
    }

    private static async Task TestAsync(Container container, DataTestItem item)
    {
        Console.WriteLine($"{item.SubDataItems.Count} sub items:");

        var singleItem = await container.ReadItemAsync<DataTestItem>(item.Id, new PartitionKey(item.Partition));
        Console.WriteLine($"ReadItemAsync: {singleItem.RequestCharge} RUs {singleItem.Diagnostics.GetClientElapsedTime()} elapsed");

        var requestOptions = new QueryRequestOptions { PartitionKey = new PartitionKey(item.Partition), MaxItemCount = -1 };
        var singleItemViaQueryable = await container
            .GetItemLinqQueryable<DataTestItem>(requestOptions: requestOptions)
            .Where(x => x.Id == item.Id)
            .ToFeedIterator()
            .ReadNextAsync();
        Console.WriteLine($"GetItemLinqQueryable: {singleItemViaQueryable.RequestCharge} RUs {singleItemViaQueryable.Diagnostics.GetClientElapsedTime()} elapsed");

        Console.WriteLine(
            $"Equals: {JsonConvert.SerializeObject(singleItem.Resource) == JsonConvert.SerializeObject(singleItemViaQueryable.Single())}\n\n");
    }

    private static async Task<DataTestItem> CreateAsync(Container container, int subItemsCount)
    {
        var item = new DataTestItem
        {
            Id = Guid.NewGuid().ToString(),
            Partition = Guid.NewGuid().ToString(),
            SubDataItems = Enumerable.Repeat(0, subItemsCount).Select(_ => new SubDataItem { Name = Guid.NewGuid().ToString()}).ToList()
        };

        await container.CreateItemAsync(item, new PartitionKey(item.Partition));

        return item;
    }

    public static async Task<CosmosClient> CreateClientAsync()
    {
        var connectionString = "connection-string";

        var client = new CosmosClientBuilder(connectionString).Build();
        var database = await client.CreateDatabaseIfNotExistsAsync("TestDb");
        await database.Database.CreateContainerIfNotExistsAsync("TestContainer", "/Partition");
        return client;
    }
}

public class DataTestItem
{
    [JsonProperty("id")]
    public string Id { get; set; }
    public string Partition { get; set; }
    public IList<SubDataItem> SubDataItems { get; set; }
}

public class SubDataItem
{
    public string Name { get; set; }
}

The snippet gives the following result for an item containing a list of 1000 simple inner elements:

1000 sub items:
ReadItemAsync: 4.76 RUs 00:00:00.0019103 elapsed
GetItemLinqQueryable: 3.72 RUs 00:00:00.0037312 elapsed
Equals: True

Am I missing something, or am I doing something wrong? With larger items, the RU charges of the two approaches can differ by a factor of tens. Is there a way to determine the item size at which this difference starts to appear? Thanks in advance!

Are you sure you're getting the same result with the query? Generally the pattern with query is to loop through feed iterator until `HasMoreResults` is false. Or use `FirstOrDefault`. See examples: https://learn.microsoft.com/en-us/dotnet/api/microsoft.azure.cosmos.container.getitemlinqqueryable?view=azure-dotnet — Noah Stahl, Jun 17 '21 at 17:36
Yes, I set MaxItemCount to -1 to allow dynamic page size. Each test has an equality check as the last input, so the results are the same. — Ann, Jun 18 '21 at 10:29
Hmm. There is also diagnostics information included in the response. It would be interesting to compare that between both operations. — Noah Stahl, Jun 18 '21 at 11:54
I have looked through the diagnostic information for both queries and found nothing to give me an answer to my question. Do you have any ideas on what to dig into? — Ann, Jun 21 '21 at 08:56
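
The pattern suggested in the first comment (reading pages until `HasMoreResults` is false) could be sketched roughly as below, summing `RequestCharge` across pages so that the query's total cost is compared against the point read. This is only a sketch: `FeedIteratorExtensions` and `DrainAsync` are hypothetical names, not part of the SDK, and the code assumes the Microsoft.Azure.Cosmos v3 package used in the question.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class FeedIteratorExtensions
{
    // Hypothetical helper (not an SDK method): reads every page of a query
    // and returns all items together with the summed RU charge of all pages.
    public static async Task<(List<T> Items, double TotalRequestCharge)> DrainAsync<T>(
        this FeedIterator<T> iterator)
    {
        var items = new List<T>();
        double totalCharge = 0;

        // A query may return its results over several pages; each page
        // reports its own RequestCharge, so the charges are accumulated.
        while (iterator.HasMoreResults)
        {
            FeedResponse<T> page = await iterator.ReadNextAsync();
            totalCharge += page.RequestCharge;
            items.AddRange(page);
        }

        return (items, totalCharge);
    }
}
```

In the test from the question, this would replace the single `ReadNextAsync()` call, for example `var (items, charge) = await queryable.ToFeedIterator().DrainAsync();`, where `queryable` stands for the filtered LINQ query. If the query completes in one page, the total equals the charge already printed.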

0 Answers