I'm using boto3 with Python, but I believe the problem and logic should be universal across all languages.

I know that table.scan() should in theory return all the records, but in fact the results of a single scan() call are limited to 1 MB in size. It's recommended to create a while loop based on LastEvaluatedKey, but that also doesn't give me all the results (15200 instead of 16000). Code here:

dynamodb = boto3.resource('dynamodb', region_name='eu-west-2')
table = dynamodb.Table(dBTable)
response = table.scan()
print("item_count:", table.item_count)
print("response1:", response["Count"])

items=[]
while 'LastEvaluatedKey' in response and response['LastEvaluatedKey'] != "":  
    print("response:", response["Count"])
    items+=response["Items"]
    response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'])

How can I fetch all the records reliably?

madej

2 Answers

It's recommended to create a while loop based on LastEvaluatedKey, but that also doesn't give me all the results (15200 instead of 16000).

Are you sure about that? My guess is that something else is going on. I use boto3 with LastEvaluatedKey in a loop in production code that runs every day, and I have never encountered a case where not all the rows were returned. I'm not saying it's impossible, but I would first make sure your code is right.

Edit, this code works:

import boto3

from boto3.dynamodb.conditions import Key, Attr

dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('DeadLetterQueue')
response = table.scan()
print("item_count:", table.item_count)

items=response["Items"]
while 'LastEvaluatedKey' in response and response['LastEvaluatedKey'] != "":  
    response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'])
    items.extend(response["Items"])


print(len(items))
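
For reuse, the same loop can be wrapped in a small helper. This is only a sketch: scan_all is a hypothetical name, not part of boto3, and it assumes `table` is a boto3 Table resource (or any object whose scan() returns a dict with "Items" and an optional "LastEvaluatedKey").

```python
def scan_all(table, **scan_kwargs):
    """Collect the items from every page of a paginated scan.

    Assumption: `table` behaves like a boto3 DynamoDB Table resource, i.e.
    scan() returns a dict with "Items" and, on every page except the last,
    a "LastEvaluatedKey".
    """
    items = []
    while True:
        response = table.scan(**scan_kwargs)
        # Append first, so the final page is included before the exit check.
        items.extend(response.get("Items", []))
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            return items
        # Resume the next scan where this page stopped.
        scan_kwargs["ExclusiveStartKey"] = last_key
```

With the table from the snippet above, `items = scan_all(table)` should give the same list. Because the append happens before the exit check, the final page cannot be skipped.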
E.J. Brennan

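The problem you are facing is not related to the DynamoDB scan operation; it's in your code. Each pass of the loop appends the items from the previous response and then fetches the next page, so when the last page arrives without a LastEvaluatedKey, the loop exits before that page's items are appended to the items list.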

Here is your code with slight modifications:

dynamodb = boto3.resource('dynamodb', region_name='eu-west-2')
table = dynamodb.Table(dBTable)
response = table.scan()
print("item_count:", table.item_count)
print("response1:", response["Count"])

items = response["Items"]  # Changed here: start with the first page's items.
while 'LastEvaluatedKey' in response and response['LastEvaluatedKey'] != "":
    response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'])
    print("response:", response["Count"])
    items += response["Items"]  # Moved here: append after each fetch.
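
To see the effect without touching AWS, here is a minimal simulation of both loop shapes. FakeTable and the two function names are hypothetical stand-ins invented for this illustration; the only assumption about DynamoDB is that every page except the last carries a LastEvaluatedKey.

```python
class FakeTable:
    """Hypothetical stand-in for a DynamoDB table that returns fixed pages."""

    def __init__(self, pages):
        self.pages = pages

    def scan(self, ExclusiveStartKey=None):
        index = 0 if ExclusiveStartKey is None else ExclusiveStartKey["page"]
        response = {"Items": self.pages[index], "Count": len(self.pages[index])}
        if index + 1 < len(self.pages):
            # Only non-final pages carry a LastEvaluatedKey.
            response["LastEvaluatedKey"] = {"page": index + 1}
        return response


def collect_append_before_fetch(table):
    """Loop shape from the question: append, then fetch the next page."""
    response = table.scan()
    items = []
    while "LastEvaluatedKey" in response:
        items += response["Items"]
        response = table.scan(ExclusiveStartKey=response["LastEvaluatedKey"])
    return items  # the final response's items were never appended


def collect_fetch_before_append(table):
    """Corrected loop shape: fetch the next page, then append it."""
    response = table.scan()
    items = list(response["Items"])
    while "LastEvaluatedKey" in response:
        response = table.scan(ExclusiveStartKey=response["LastEvaluatedKey"])
        items += response["Items"]
    return items


table = FakeTable([[1, 2], [3, 4], [5]])
print(len(collect_append_before_fetch(table)))  # 4: last page dropped
print(len(collect_fetch_before_append(table)))  # 5: all pages collected
```

The missing items are exactly the contents of the final page, which is consistent with getting 15200 of 16000 records.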
Sarthak Jain