
I have successfully loaded a dataset into DynamoDB. I then want to read the data back from DynamoDB, write it to a .csv file, and use that file in Weka to build clusters. Unfortunately, only some of the data read from DynamoDB ends up in the .csv file. Below is the snippet that reads from DynamoDB. I have 2201 records in my DynamoDB table, but writing to the file stops abruptly at the 1986th record, in the 3rd column. I have tried every solution I could find online but was not able to fix this. Could someone please help me with this?

//scanning the data from DynamoDB

ScanRequest scanRequest = new ScanRequest().withTableName(tablename[2]);
ScanResult result = client.scan(scanRequest);
for (Map<String, AttributeValue> item : result.getItems()) {
    printItem(item, writer);
}

//appending the data into an empty CSV file

private static void printItem(Map<String, AttributeValue> attributeList, FileWriter writer) {
    System.out.println("Inside printItem");
    try {
        int k = 1;
        for (Map.Entry<String, AttributeValue> item : attributeList.entrySet()) {
            AttributeValue value = item.getValue();
            String valueName = value.getS();
            writer.append(valueName);
            if (k <= 4) {
                writer.append(',');
            }
            ++k;
        }
        writer.append('\n');
        ++count;
    } catch (IOException e) {
        e.printStackTrace();
    }
}
– kirti

1 Answer


Scan is a paginated API: a single call returns at most 1 MB of data, so you have to keep calling it repeatedly, passing the LastEvaluatedKey of each response as the ExclusiveStartKey of the next request, until no LastEvaluatedKey is returned. More details are in the developer guide and API docs.
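A minimal sketch of that pagination loop, reusing the `client`, `tablename`, `writer`, and `printItem` names from your question (assumed to be set up as in your code):

```java
// Keep scanning until DynamoDB stops returning a LastEvaluatedKey,
// which signals that the final page has been read.
Map<String, AttributeValue> lastKey = null;
do {
    ScanRequest scanRequest = new ScanRequest()
            .withTableName(tablename[2])
            .withExclusiveStartKey(lastKey); // null on the first call
    ScanResult result = client.scan(scanRequest);
    for (Map<String, AttributeValue> item : result.getItems()) {
        printItem(item, writer);
    }
    lastKey = result.getLastEvaluatedKey(); // null when there are no more pages
} while (lastKey != null);
writer.flush(); // make sure buffered CSV rows actually reach the file
```

The explicit `flush()` at the end is also worth noting: a `FileWriter` buffers output, so records can appear to stop "abruptly" mid-column if the writer is never flushed or closed.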

The DynamoDBMapper SDK and document SDK (both ship with the aws-java-sdk) provide automatic pagination APIs, so you can treat your table as an Iterable instead of paginating yourself. There's an example of doing pagination with the low-level Java SDK, as you're doing, in this section of the developer guide.
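For illustration, a sketch of the DynamoDBMapper approach; `MyItem` here is a hypothetical class you would annotate to match your table's schema:

```java
// Hypothetical mapped class -- annotate with your real table and key names.
@DynamoDBTable(tableName = "my-table")
public class MyItem {
    private String id;

    @DynamoDBHashKey(attributeName = "id")
    public String getId() { return id; }
    public void setId(String id) { this.id = id; }
}

DynamoDBMapper mapper = new DynamoDBMapper(client);

// PaginatedScanList fetches further pages lazily as you iterate,
// so the whole table can be walked without manual ExclusiveStartKey handling.
PaginatedScanList<MyItem> items =
        mapper.scan(MyItem.class, new DynamoDBScanExpression());
for (MyItem item : items) {
    // write each item to the CSV here
}
```

The trade-off is that the mapper requires a typed class per table, whereas the low-level loop works directly with `Map<String, AttributeValue>`.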

– David Yanacek
  • Also, if you're looking to export your table into CSV files, you may be interested in the EMR integration, which can export your tables to CSV files in S3, even on a schedule, using Data Pipeline: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/OtherServices.html – David Yanacek Nov 09 '14 at 18:20