
I am facing a weird issue with AWS DynamoDB. I am querying my table through AWS API Gateway (AWS Service Proxy), and I get Count: 0 while ScannedCount is approximately 2,500 out of a total of 10,000 records. To confirm: the data I am looking for does exist in the table, and I am using the Scan operation on DynamoDB.

What I cannot understand is why ScannedCount is less than the total number of records in the table. Is this supposed to happen?

Abdeali Chandanwala

1 Answer


According to the DynamoDB documentation, ScannedCount is the number of items DynamoDB looked through for the current request, and Count is the number of those items that matched your filter:

Counting the Items in the Results

In addition to the items that match your criteria, the Query response contains the following elements:

  • ScannedCount: the number of items that matched the key condition expression, before a filter expression (if present) was applied.
  • Count: the number of items that remain, after a filter expression (if present) was applied.

Note

If you do not use a filter expression, then ScannedCount and Count will have the same value.

If the size of the Query result set is larger than 1 MB, then ScannedCount and Count will represent only a partial count of the total items. You will need to perform multiple Query operations in order to retrieve all of the results (see Paginating the Results).

Each Query response will contain the ScannedCount and Count for the items that were processed by that particular Query request. To obtain grand totals for all of the Query requests, you could keep a running tally of both ScannedCount and Count.

So in your case the Scan read the first 2,500 items before reaching the 1 MB limit (ScannedCount is 2500), and none of them matched your filter (Count is 0).

To scan the rest of the data in the table, you need to repeat the request with pagination parameters as described here:

A single Scan will only return a result set that fits within the 1 MB size limit. To determine whether there are more results, and to retrieve them one page at a time, applications should do the following:

  1. Examine the low-level Scan result:
    • If the result contains a LastEvaluatedKey element, proceed to step 2.
    • If there is not a LastEvaluatedKey in the result, then there are no more items to be retrieved.
  2. Construct a new Scan request, with the same parameters as the previous one, but this time take the LastEvaluatedKey value from step 1 and use it as the ExclusiveStartKey parameter in the new Scan request.
  3. Run the new Scan request.
  4. Go to step 1.
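
The steps above can be sketched in Python as follows. This is only a minimal illustration, not the way API Gateway or any particular SDK does it: `scan_all` is a hypothetical helper name, and it takes the scan function as an argument so that it works with any callable that accepts Scan-style keyword parameters and returns a dict shaped like the low-level Scan response. It also keeps the running tally of Count and ScannedCount mentioned in the documentation.

```python
def scan_all(scan, **params):
    """Call `scan` repeatedly until the response has no LastEvaluatedKey.

    `scan` is assumed to behave like a low-level DynamoDB Scan call:
    it takes keyword parameters and returns a dict that may contain
    Items, Count, ScannedCount and LastEvaluatedKey.
    """
    items = []
    count = 0
    scanned_count = 0
    while True:
        page = scan(**params)
        items.extend(page.get("Items", []))
        # Each page reports only its own counts, so keep a running tally.
        count += page.get("Count", 0)
        scanned_count += page.get("ScannedCount", 0)

        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            # No LastEvaluatedKey: there are no more items to retrieve.
            break
        # Same parameters as before, resuming after the last evaluated key.
        params["ExclusiveStartKey"] = last_key
    return {"Items": items, "Count": count, "ScannedCount": scanned_count}


# Hypothetical usage with boto3 (the table name is a placeholder):
#   import boto3
#   client = boto3.client("dynamodb")
#   totals = scan_all(client.scan, TableName="my-table")
```

Note that a page can come back with Count 0 and still carry a LastEvaluatedKey, which is exactly the situation in the question: the loop must continue on the key, not on whether any items were returned.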

Depending on the language, you can find a library that does the pagination for you, such as the boto2 high-level DynamoDB client for Python or a "paginator" in boto3.

Borys Serebrov
  • I know what you mentioned. My question is that I am getting no records from the table, whereas the data is present in the table, and I believe the Scan operation is not scanning the whole table because the ScannedCount is much lower than the total number of records in the table – Abdeali Chandanwala Oct 15 '17 at 14:13
  • It's also in there - `If the size of the Query result set is larger than 1 MB, then ScannedCount and Count will represent only a partial count of the total items. You will need to perform multiple Query operations in order to retrieve all of the results (see Paginating the Results).`. So the scan went through first 2500 records (ScannedCount is 2500) and there are no results matching your filter (Count is zero). Now you need to continue the scan (see "paginating the results" as docs mention). – Borys Serebrov Oct 15 '17 at 14:24
  • Also, I think I misread your question. I thought your concern was about `Count: 0` and that you were assuming this is the total number of records, while you are actually asking why ScannedCount is 2500, which is less than the total of 10k records. I'll update the answer accordingly. – Borys Serebrov Oct 15 '17 at 14:27
  • Hi Boris - Do you mean that DynamoDB will scan only 1 MB of data each time and no more than that? I believed the 1 MB limit applied to the results returned and not to the scan. Please confirm your answer – Abdeali Chandanwala Oct 16 '17 at 05:51
  • @macmold yes, it retrieves the first 1 MB of data, then filters it and returns the first page to you (even if it's empty). Then you repeat the request to get more data, see [the "Scan" section in docs](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Scan.html). Depending on the language, you can find a library that paginates for you, like the [boto2 high-level dynamodb client](http://boto.cloudhackers.com/en/latest/ref/dynamodb2.html#boto.dynamodb2.table.Table.scan) for Python or a ["paginator"](https://stackoverflow.com/questions/36780856/complete-scan-of-dynamodb-with-boto3) in boto3. – Borys Serebrov Oct 16 '17 at 09:08
  • Thanx Boris - Implemented the Pagination on API Gateway – Abdeali Chandanwala Oct 16 '17 at 12:31