3

Let us say, We have a situation where instead of getting the total count in a table, get the count of records with a particular status. We know DynamoDb is schemaless and still has to count each record one by one to get the total count. And yet, How can we leverage the above need using dynamoDb queries?

Mukundhan
  • 3,284
  • 23
  • 36

1 Answers1

5

While normally "Query" or "Scan" requests return all the matching items, you can pass the Select=COUNT parameter and ask to retrieve only the number of matching items, instead of the actual items. But before you go doing that, there are a few things you should know:

  1. DynamoDB will still be reading - and you will still be paying for - all the data, even if just for being counted. Doing a "Scan" with a filter is in almost all cases out of the question, because it will read the entire data set every time. With a "Query" you can ask to read just one partition, or a contiguous range of sort-keys in one partition, which in some cases may be reasonable enough (but please think if it is, in your use case).

  2. Even if you're not actually reading the data, and just counting, DynamoDB still does Scan and Query with "paging", i.e., your reads request will read just 1MB of data from disk, return you the partial count, and ask you to submit another request to resume the scan. Your DynamoDB library probably has a way to automate this resumption, so for example it can run thousands or whatever number of queries needed until finally finishing the scan and calculating the total sum.

  3. In some cases, it may make sense for to maintain a counter in addition to the data. Writes will be more expensive (e.g., each write adds data and increments the counter), but reads that need this counter will be hugely cheaper - so it all depends on how much of each your workload needs.

Nadav Har'El
  • 11,785
  • 1
  • 24
  • 45