
Scenario:

I want to retrieve items from a DynamoDB table that has 200k records, and I am trying to fetch them across multiple requests:

  • For the first request, I want only 100 records.
  • For the second request, I want the next 100 records, excluding the records already returned by the first request.

My implementation:

import botocore.exceptions
from boto3.dynamodb.conditions import Key

# `table`, `dateFrom`, and `dateTo` are defined elsewhere.
scan_kwargs = {}
totalRecords = 0
complete = False

while not complete:
    try:
        # Limit caps the items evaluated per call, before the filter is applied.
        response = table.scan(
            Limit=10000,
            FilterExpression=Key('timestamp').between(dateFrom, dateTo),
            **scan_kwargs,
        )
    except botocore.exceptions.ClientError as error:
        raise Exception('Error') from error
    # LastEvaluatedKey is absent once the scan has reached the end of the table.
    next_key = response.get('LastEvaluatedKey')
    complete = next_key is None
    if not complete:
        scan_kwargs['ExclusiveStartKey'] = next_key

    for record in response['Items']:
        print(record)
        totalRecords = totalRecords + 1
        if totalRecords > 100:
            break
    if totalRecords > 100:
        break

With the code above, every request returns only the first 100 records. My requirement is for the second request to return records 101 to 200 and skip the first 100.

Can anyone help with a working example that meets this requirement?

Vidd
  • If you want to get 100 items at a time, why did you set Limit=10000? Also note that Limit is an upper limit. The scan will stop when DynamoDB's processed dataset size exceeds 1 MB. – jarmod Jul 22 '22 at 21:42
  • I know that; that's why I added a while loop that continues from the LastEvaluatedKey. The limit is just an arbitrary value I chose so that each scan retrieves as many items as it can. – Vidd Jul 22 '22 at 22:00
  • Is this code supposed to be in a function that you call repeatedly to get the next 100 items? Right now it's just linear code and the while loop terminates when you've scanned the last item in the table or you hit 100 items (actually 101 because your test appears to be wrong). – jarmod Jul 22 '22 at 23:09
  • FYI related post: https://stackoverflow.com/questions/36780856/complete-scan-of-dynamodb-with-boto3 – jarmod Jul 22 '22 at 23:38
  • But, generally speaking, you should consider using [boto3 paginators](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/paginators.html). Also, see [here](https://stackoverflow.com/questions/39201093/how-to-use-boto3-pagination). – jarmod Jul 23 '22 at 00:05
  • The above is not relevant to my use case. I am looking for pagination like this for DynamoDB, [link](https://www.tutorialspoint.com/book-pagination-in-python), where I can decide the page size. I am also looking for a predefined function that returns the page number from AWS DynamoDB. – Vidd Jul 24 '22 at 17:57
  • That doesn't exist. It's generally not a feature of large scale NoSQL databases afaik. More [here](https://dynobase.dev/dynamodb-pagination/). – jarmod Jul 24 '22 at 18:26
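
Since numbered pages are not available, one possible workaround is to hand the caller a resume key along with each batch and pass it back in on the next request. The sketch below is only an illustration under some assumptions: `fetch_page` and `key_names` are hypothetical names, the table is assumed to have a single partition key called `id`, and the scan is assumed to run against the base table (an index scan would need the index keys in the resume key as well). It relies on `ExclusiveStartKey` accepting the primary key of the last item returned, so a batch can stop in the middle of a scan response without losing or repeating items.

```python
def fetch_page(table, page_size=100, start_key=None, key_names=("id",), **scan_kwargs):
    """Return (items, next_start_key) holding up to page_size scanned items.

    `table` is expected to behave like a boto3 DynamoDB Table resource.
    `key_names` lists the table's primary key attributes (hypothetical default).
    Extra keyword arguments, such as Limit or FilterExpression, go to table.scan.
    """
    items = []
    kwargs = dict(scan_kwargs)
    while True:
        if start_key is not None:
            kwargs["ExclusiveStartKey"] = start_key
        else:
            kwargs.pop("ExclusiveStartKey", None)
        response = table.scan(**kwargs)
        for record in response.get("Items", []):
            items.append(record)
            if len(items) == page_size:
                # Resume right after the last item handed to the caller.
                return items, {name: record[name] for name in key_names}
        start_key = response.get("LastEvaluatedKey")
        if start_key is None:
            # The scan reached the end of the table.
            return items, None


# Hypothetical usage with a real table (not executed here):
#   first, key = fetch_page(table, 100, None, ("id",),
#                           FilterExpression=Key("timestamp").between(dateFrom, dateTo))
#   second, key = fetch_page(table, 100, key, ("id",),
#                            FilterExpression=Key("timestamp").between(dateFrom, dateTo))
```

The returned key has to be kept by the caller between requests (for example, sent back to the client as a token). If a batch happens to fill up exactly on the last item of the table, the next call returns an empty list and `None`. This still gives only "next batch" navigation; jumping straight to an arbitrary page number would require scanning everything before it.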
