78

I have a table called products with primary key Id. I want to select all items in the table. This is the code I'm using:

$batch_get_response = $dynamodb->batch_get_item(array(
    'RequestItems' => array(

        'products' => array(
            'Keys' => array(
                array( // Key #1
                    'HashKeyElement'  => array( AmazonDynamoDB::TYPE_NUMBER => '1'),
                    'RangeKeyElement' => array( AmazonDynamoDB::TYPE_NUMBER => $current_time),
                ),
                array( // Key #2
                    'HashKeyElement'  => array( AmazonDynamoDB::TYPE_NUMBER => '2'),
                    'RangeKeyElement' => array( AmazonDynamoDB::TYPE_NUMBER => $current_time),
                ),
            )
        )
    )   
));

Is it possible to select all items without specifying the primary key? I'm using the AWS SDK for PHP.

Steffen Opel
Warrior

9 Answers

78

Amazon DynamoDB provides the Scan operation for this purpose, which returns one or more items and their attributes by performing a full scan of a table. Please be aware of the following two constraints:

  • Depending on your table size, you may need to use pagination to retrieve the entire result set:

    Note
    If the total number of scanned items exceeds the 1MB limit, the scan stops and results are returned to the user with a LastEvaluatedKey to continue the scan in a subsequent operation. The results also include the number of items exceeding the limit. A scan can result in no table data meeting the filter criteria.

    The result set is eventually consistent.

  • The Scan operation is potentially costly regarding both performance and consumed capacity units (i.e. price); see the section Scan and Query Performance in Query and Scan in Amazon DynamoDB:

    [...] Also, as a table grows, the scan operation slows. The scan operation examines every item for the requested values, and can use up the provisioned throughput for a large table in a single operation. For quicker response times, design your tables in a way that can use the Query, Get, or BatchGetItem APIs, instead. Or, design your application to use scan operations in a way that minimizes the impact on your table's request rate. For more information, see Provisioned Throughput Guidelines in Amazon DynamoDB. [emphasis mine]
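The two constraints above (pagination and throughput cost) can be handled in one loop. The following is a minimal Python sketch, not taken from the AWS documentation: it assumes a boto3 `Table` resource (or any object with a compatible `scan` method), and the names `scan_in_pages`, `page_size` and `pause_seconds` are illustrative. It follows `LastEvaluatedKey` until the table is exhausted, and uses `Limit` plus an optional pause between pages to spread the read load.

```python
import time


def scan_in_pages(table, page_size=100, pause_seconds=0.0):
    """Yield every item of a table, one Scan page at a time.

    `table` is assumed to behave like a boto3 DynamoDB Table resource:
    scan(**kwargs) returns a dict with 'Items' and, while more data
    remains, 'LastEvaluatedKey'.
    """
    kwargs = {"Limit": page_size}  # cap the items evaluated per request
    while True:
        response = table.scan(**kwargs)
        yield from response.get("Items", [])
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            break  # no more pages
        kwargs["ExclusiveStartKey"] = last_key
        if pause_seconds:
            time.sleep(pause_seconds)  # crude throttle between pages
```

With boto3 installed and credentials configured, `list(scan_in_pages(boto3.resource('dynamodb').Table('products')))` would collect the whole table; a larger `pause_seconds` trades speed for a lower request rate.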

You can find more details about this operation and some example snippets in Scanning Tables Using the AWS SDK for PHP Low-Level API for Amazon DynamoDB. The simplest example there is:

$dynamodb = new AmazonDynamoDB();

$scan_response = $dynamodb->scan(array(
    'TableName' => 'ProductCatalog' 
));

foreach ($scan_response->body->Items as $item)
{
    echo "<p><strong>Item Number:</strong>"
         . (string) $item->Id->{AmazonDynamoDB::TYPE_NUMBER};
    echo "<br><strong>Item Name: </strong>"
         . (string) $item->Title->{AmazonDynamoDB::TYPE_STRING} ."</p>";
}
Steffen Opel
  • is it possible to add condition in the query? – Warrior May 07 '12 at 06:04
  • Yes, check the _Request_ section for [Scan](http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/API_Scan.html), `ScanFilter:ComparisonOperator` provides a summary of what you can do. Depending on your scenario, you may want to look into [Query](http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/API_Query.html) as well, which is usually preferable both for performance and cost reasons (but requires a primary key), as addressed in [Query and Scan in Amazon DynamoDB](http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/QueryAndScan.html). – Steffen Opel May 07 '12 at 07:12
  • can u look on to my new question? http://stackoverflow.com/questions/10477996/writing-complex-queries-in-amazone-dynamo-dbmathematical-expressions – Warrior May 07 '12 at 07:16
  • Is it possible to get the last inserted id (MAX of primay Id)? – Warrior May 08 '12 at 10:12
  • @THOmas: No, the [Primary Key](http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/DataModel.html#DataModelPrimaryKey) is either a _Hash Type Primary Key_ or a _Hash and Range Type Primary Key_; the latter allows relative order of items (see the `RangeKeyCondition` of [Query](http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/API_Query.html), but that's about it. Accordingly (as usual for NoSQL solutions), if you really need this, you would need to model it yourself (after all, you inserted the key) - you seem to still think too much in SQL terms here ;) – Steffen Opel May 08 '12 at 10:29
  • So when i am inserting new product what should i assign to the id field ? – Warrior May 08 '12 at 11:17
  • @THOmas: That's entirely up to you, i.e. it depends on how you model your problem domain, thus it is actually unrelated to DynamoDB. – Steffen Opel May 08 '12 at 11:50
  • @THOmas: Regarding DynamoDB, I suggest you read through [Getting Started with Amazon DynamoDB](http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/GettingStartedDynamoDB.html), there are PHP samples for all steps as well. [Creating a Table](http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/LowLevelPHPCreateUpdateDeleteTable.html#sect-create-tables) shows how the usual integer based id is modeled and [Putting an Item](http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/LowLevelPHPItemCRUD.html#PutLowLevelAPIPHP) how it is used in turn. – Steffen Opel May 08 '12 at 11:50
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/11015/discussion-between-thomas-and-steffen-opel) – Warrior May 08 '12 at 11:52
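To illustrate the condition discussed in these comments: current versions of the API express a scan condition as a `FilterExpression`, which supersedes the legacy `ScanFilter` parameter mentioned above. A minimal Python sketch, assuming a boto3 `Table` resource and a hypothetical numeric attribute `Price`; the function name is illustrative as well. The filter is applied after each page of up to 1 MB has been read, so pagination is still needed and capacity is still consumed for every item examined.

```python
def scan_cheaper_than(table, max_price):
    """Return all items whose Price attribute is below max_price.

    `table` is assumed to behave like a boto3 DynamoDB Table resource;
    'Price' is a hypothetical attribute used only for illustration.
    """
    kwargs = {
        "FilterExpression": "#p < :max",
        # Placeholders avoid clashes with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#p": "Price"},
        "ExpressionAttributeValues": {":max": max_price},
    }
    items = []
    while True:
        response = table.scan(**kwargs)
        items.extend(response["Items"])
        if "LastEvaluatedKey" not in response:
            return items
        # Continue from where the previous page stopped.
        kwargs["ExclusiveStartKey"] = response["LastEvaluatedKey"]
```

When calling this against a real table, note that boto3 expects numbers as `int` or `decimal.Decimal` rather than `float`.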
22

You can fetch all items with boto3 in Python:

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('Table')

# The first call returns at most 1 MB of data
response = table.scan()
items = response['Items']

# Keep scanning from where the previous page stopped
while 'LastEvaluatedKey' in response:
    print(response['LastEvaluatedKey'])
    response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'])
    items.extend(response['Items'])

Igor Brejc
10

I gather you are using PHP, although the question did not say so originally (it has since been edited). I found this question while searching the internet, and since I got a solution working, here is a simple one using scan for those who use Node.js:

  var dynamoClient = new AWS.DynamoDB.DocumentClient();
  var params = {
    TableName: config.dynamoClient.tableName, // give it your table name 
    Select: "ALL_ATTRIBUTES"
  };

  dynamoClient.scan(params, function(err, data) {
    if (err) {
       console.error("Unable to scan the table. Error JSON:", JSON.stringify(err, null, 2));
    } else {
       console.log("Scan succeeded:", JSON.stringify(data, null, 2));
    }
  });

I assume the same code can be translated to PHP using the corresponding AWS SDK.

StefaDesign
2

I fetch all items from DynamoDB with the following query, and it works fine. I wrote these functions as generic helpers in Zend Framework and use them throughout the project.

        public function getQuerydata($tablename, $filterKey, $filterValue) {
            return $this->getQuerydataWithOp($tablename, $filterKey, $filterValue, 'EQ');
        }

        public function getQuerydataWithOp($tablename, $filterKey, $filterValue, $compOperator) {
            $result = $this->getClientdb()->query(array(
                'TableName'     => $tablename,
                'IndexName'     => $filterKey,
                'Select'        => 'ALL_ATTRIBUTES',
                'KeyConditions' => array(
                    $filterKey => array(
                        'AttributeValueList' => array(
                            array('S' => $filterValue)
                        ),
                        'ComparisonOperator' => $compOperator
                    )
                )
            ));
            return $result['Items'];
        }

        // Below I call these helpers to get the data.
        // (getQuerydataWithPrimary is not shown in this answer; presumably it is a sibling of the helpers above.)
        $accountsimg = $this->getQuerydataWithPrimary('accounts', 'accountID', $msgdata[0]['accountID']['S']);
Roman Newaza
Hassan Raza
  • How many records are in your database? There's a limit of 1MB, it seems like this would work great for smaller databases, but wouldn't get everything if you had some substantial data in there. – dev_row Oct 16 '15 at 15:00
  • you can use limit parameters that amazon db provide when if you get records greater than 1 MB. – Hassan Raza Oct 16 '15 at 17:45
  • From http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/QueryAndScan.html#FilteringResults : "A single Query or Scan operation can retrieve a maximum of 1 MB of data. This limit applies before any filter expression is applied to the results. " – dev_row Oct 18 '15 at 17:50
  • does not get ALL items as OP wants. – Kashyap May 08 '19 at 19:06
2

A simple snippet that lists all the items from a DynamoDB table, specifying the AWS region explicitly.

import boto3

dynamodb = boto3.resource('dynamodb', region_name='ap-south-1')
table = dynamodb.Table('puppy_store')
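# A single scan call returns at most 1 MB of data; larger tables
# need to follow LastEvaluatedKey to fetch the remaining pages.
response = table.scan()
items = response['Items']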

# Prints All the Items at once
print(items)

# Prints Items line by line
for i, j in enumerate(items):
    print(f"Num: {i} --> {j}")
1

Here is an example for Java. In withAttributesToGet you specify exactly which attributes you want to read. Before running it, you have to place a credentials file in your .aws folder.

import com.amazonaws.regions.Regions;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.ScanRequest;
import com.amazonaws.services.dynamodbv2.model.ScanResult;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVPrinter;

import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;

public class DownloadAllRecords {

    public static final String TABLE_NAME = "table_name";

    public static final AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard()
            .withRegion(Regions.CA_CENTRAL_1)
            .build();

    public static void main(String[] args) throws IOException, InterruptedException {
        downloadAllRecords();
    }

    public static void downloadAllRecords() throws InterruptedException, IOException {
        final Object[] FILE_HEADER = {"key", "some_param"};
        CSVFormat csvFormat = CSVFormat.DEFAULT.withRecordSeparator("\n");

        ScanRequest scanRequest = new ScanRequest()
                .withTableName(TABLE_NAME)
                .withConsistentRead(false)
                .withLimit(100)                             // items evaluated per page
                .withAttributesToGet("key", "some_param");  // attributes to read

        // try-with-resources closes (and thereby flushes) the CSV file, even on errors
        try (CSVPrinter csvPrinter = new CSVPrinter(new FileWriter(TABLE_NAME + ".csv"), csvFormat)) {
            csvPrinter.printRecord(FILE_HEADER);
            do {
                ScanResult result = client.scan(scanRequest);
                for (Map<String, AttributeValue> item : result.getItems()) {
                    AttributeValue keyAttribute = item.getOrDefault("key", new AttributeValue());
                    AttributeValue someParamAttribute = item.getOrDefault("some_param", new AttributeValue());

                    List<String> record = new ArrayList<>();
                    record.add(keyAttribute.getS());
                    record.add(someParamAttribute.getS());
                    csvPrinter.printRecord(record);
                    TimeUnit.MILLISECONDS.sleep(50); // crude throttle between items
                }
                // The key is null once the last page has been read, which ends the loop
                scanRequest.setExclusiveStartKey(result.getLastEvaluatedKey());
            } while (scanRequest.getExclusiveStartKey() != null);
        }
        System.out.println("CSV file generated successfully.");
    }
}

Also add the necessary dependencies to your pom.xml. The code above itself only uses aws-java-sdk-dynamodb and commons-csv.

<dependencies>
   <dependency>
       <groupId>com.sparkjava</groupId>
       <artifactId>spark-core</artifactId>
       <version>2.5.4</version>
   </dependency>
   <!-- https://mvnrepository.com/artifact/com.sparkjava/spark-template-velocity -->
   <dependency>
       <groupId>com.sparkjava</groupId>
       <artifactId>spark-template-velocity</artifactId>
       <version>2.7.1</version>
   </dependency>
   <!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-logs -->
   <dependency>
       <groupId>com.amazonaws</groupId>
       <artifactId>aws-java-sdk-logs</artifactId>
       <version>1.12.132</version>
   </dependency>
   <dependency>
       <groupId>com.google.code.gson</groupId>
       <artifactId>gson</artifactId>
       <version>2.8.9</version>
   </dependency>
   <dependency>
       <groupId>com.google.guava</groupId>
       <artifactId>guava</artifactId>
       <version>31.0.1-jre</version>
   </dependency>
   <dependency>
       <groupId>org.apache.commons</groupId>
       <artifactId>commons-collections4</artifactId>
       <version>4.4</version>
   </dependency>
   <dependency>
       <groupId>com.opencsv</groupId>
       <artifactId>opencsv</artifactId>
       <version>5.3</version>
   </dependency>
   <!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-dynamodb -->
   <dependency>
       <groupId>com.amazonaws</groupId>
       <artifactId>aws-java-sdk-dynamodb</artifactId>
       <version>1.12.161</version>
   </dependency>
   <dependency>
       <groupId>org.apache.commons</groupId>
       <artifactId>commons-csv</artifactId>
       <version>1.1</version>
   </dependency>

</dependencies>

Example of a credentials file (~/.aws/credentials):

[default]
aws_access_key_id = AAAAAAA
aws_secret_access_key = AAAAAAAA
aws_session_token = AAAAAAA
Andriy
0

The following code does not specify a primary key:

import boto3

client = boto3.client('dynamodb')
table = 'table_name'
response = client.scan(
    TableName=table,
    AttributesToGet=['second_field_in_order', 'first_field_in_order']
)
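`AttributesToGet` is a legacy parameter, and a single `scan` call returns at most 1 MB of data. A hedged sketch of the same idea using `ProjectionExpression` and the low-level client's `scan` paginator follows; `client` is assumed to be a `boto3.client('dynamodb')` (or a compatible object), and the function name `scan_fields` is illustrative.

```python
def scan_fields(client, table_name, fields):
    """Scan a whole table, returning only the named attributes.

    `client` is assumed to behave like boto3.client('dynamodb'); items
    come back in DynamoDB's typed format, e.g. {'name': {'S': 'x'}}.
    """
    # Map each field to a '#fN' placeholder so reserved words are safe.
    names = {f"#f{i}": field for i, field in enumerate(fields)}
    paginator = client.get_paginator("scan")
    pages = paginator.paginate(
        TableName=table_name,
        ProjectionExpression=", ".join(names),
        ExpressionAttributeNames=names,
    )
    items = []
    for page in pages:  # the paginator follows LastEvaluatedKey for us
        items.extend(page.get("Items", []))
    return items
```

For the table above, this might be called as `scan_fields(client, 'table_name', ['second_field_in_order', 'first_field_in_order'])`.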
mold
0

A way to fetch all rows without using the primary key, and without running a Scan, is to add a GSI (global secondary index) to the table, keyed on an additional attribute. Set this attribute to the same value in every row (unless you want to use it for multiple purposes).

You then do a Query on the GSI specifying the value of the added attribute.

You might have to paginate the returned values.

A GSI keeps its own copy of the projected attributes, so it increases the storage used and may increase costs.
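The GSI approach can be sketched in Python as follows. This is only an illustration under stated assumptions: the index name `all-items-index`, the constant attribute `gsi_pk` and its value `"ALL"` are hypothetical, and `table` is assumed to be a boto3 `Table` resource whose index was created with `gsi_pk` as its partition key.

```python
def query_all_via_gsi(table, index_name="all-items-index",
                      key_attr="gsi_pk", key_value="ALL"):
    """Fetch every item by querying a GSI on a constant attribute.

    All names and defaults here are hypothetical; `table` is assumed
    to behave like a boto3 DynamoDB Table resource.
    """
    kwargs = {
        "IndexName": index_name,
        "KeyConditionExpression": "#k = :v",
        "ExpressionAttributeNames": {"#k": key_attr},
        "ExpressionAttributeValues": {":v": key_value},
    }
    items = []
    while True:
        response = table.query(**kwargs)
        items.extend(response["Items"])
        if "LastEvaluatedKey" not in response:
            return items
        # Query results are paginated just like Scan results.
        kwargs["ExclusiveStartKey"] = response["LastEvaluatedKey"]
```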

MikeW
-3

This C# code fetches items from a DynamoDB table using CreateBatchGet. Note that it needs the list of keys (IdList) up front.

        string tablename = "AnyTableName"; //table whose data you want to fetch

        var BatchRead = ABCContext.Context.CreateBatchGet<ABCTable>(  

            new DynamoDBOperationConfig
            {
                OverrideTableName = tablename
            });

        foreach(string Id in IdList) // in case you are taking string from input
        {
            Guid objGuid = Guid.Parse(Id); //parsing string to guid
            BatchRead.AddKey(objGuid);
        }

        await BatchRead.ExecuteAsync();
        var result = BatchRead.Results;

// ABCTable is the table model class that maps to the DynamoDB table whose data you want to fetch