14

I would like to implement a DynamoDB Scan with the following logic:

Scanning -> Filtering(boolean true or false) -> Limiting(for pagination)

However, I have only been able to implement a Scan with this logic:

Scanning -> Limiting(for pagination) -> Filtering(boolean true or false)

How can I achieve this?

Below is an example I have written that implements the second Scan logic:

    var parameters = {
        TableName: this.tableName,
        Limit: queryStatement.limit
    };
    if ('role' in queryStatement) {
        parameters.FilterExpression = '#role = :role';
        parameters.ExpressionAttributeNames = {
            '#role': 'role'
        };
        parameters.ExpressionAttributeValues = {
            ':role': queryStatement.role
        };
    }
    if ('startKey' in queryStatement) {
        parameters.ExclusiveStartKey = { id: queryStatement.startKey};
    }

    this.documentClient.scan(parameters, (errorResult, result) => {
        if (errorResult) {
            errorResult._status = 500;
            return reject(errorResult);
        }

        return resolve(result);
    });

This codes works like second one.

Scanning -> Limiting -> Filtering

F_SO_K
  • 13,640
  • 5
  • 54
  • 83
Wooyoung Tyler Kim
  • 484
  • 2
  • 9
  • 27
  • Can you please tell me how you fixed the issue with GSI? – Eric May 15 '18 at 17:30
  • 5
    Oh, I made GSI which always return filtered values without filter. So, SCANNING -> LIMITING -> FILTERING is changing GSI SCANNING(SAME EFFECT WITH FILTER) -> LIMITING. GSI can do anything, but be careful, GSI data can not update synchronously. So If you handle realtime data, this will be not a proper answer. – Wooyoung Tyler Kim Jul 08 '18 at 15:57
  • @Wooyoung Tyler Kim It would be great if you post your final solution as an answer here. Looks like lot of people are curious to see. – Gem May 17 '21 at 12:43
  • @Gem Yeah, Original source is in company so, I write this part separately as soon as I can. – Wooyoung Tyler Kim Jun 07 '21 at 13:56

6 Answers6

12

The DynamoDB LIMIT works as mentioned below (i.e. second approach in your post) by design. As it works by design, there is no solution for this.

LastEvaluatedKey should be used to get the data on subsequent scans.

Scanning -> Limiting(for pagination) -> Filtering(boolean true or false)

In a request, set the Limit parameter to the number of items that you want DynamoDB to process before returning results.

In a response, DynamoDB returns all the matching results within the scope of the Limit value. For example, if you issue a Query or a Scan request with a Limit value of 6 and without a filter expression, DynamoDB returns the first six items in the table that match the specified key conditions in the request (or just the first six items in the case of a Scan with no filter). If you also supply a FilterExpression value, DynamoDB will return the items in the first six that also match the filter requirements (the number of results returned will be less than or equal to 6).

For either a Query or Scan operation, DynamoDB might return a LastEvaluatedKey value if the operation did not return all matching items in the table. To get the full count of items that match, take the LastEvaluatedKey value from the previous request and use it as the ExclusiveStartKey value in the next request. Repeat this until DynamoDB no longer returns a LastEvaluatedKey value.

notionquest
  • 37,595
  • 6
  • 111
  • 105
  • 3
    Oh, you mean, there is no solution to solve this problem(?) right? because this is what DynamoDB intend to. I use LastEvaluatedKey for pagination, like http://blabla.com/users/?limit=10&startKey='' startKey get from LastEvaluatedKey. I want to get 10 datas per http response because i set limit=10, but it can't because of above reason right? – Wooyoung Tyler Kim Oct 20 '16 at 08:09
  • Thanks, for your reply, and i gotta find other way to solve this. – Wooyoung Tyler Kim Oct 20 '16 at 08:11
  • Good. Yes, because of the above reason (i.e. second paragraph last sentence in the bracket). You need to use LastEvaluatedKey until you get 10 records that you expected. The limit in Dynamodb works differently when compared to RDBMS. – notionquest Oct 20 '16 at 08:20
  • Thanks, I solved this problem using global secondary index. – Wooyoung Tyler Kim Oct 24 '16 at 06:45
  • @101110101100111111101101 can you post here how to solve this issue by GSI? – harley Jul 08 '18 at 14:14
  • @AnatoliiB I added to original question's comment. I hope this comment helps you. – Wooyoung Tyler Kim Jul 08 '18 at 15:54
5

Use --max-items=2 instead of --limit=2, max-items will do limit after filtering.

Sample query with max-items:

aws dynamodb query --table-name=limitTest --key-condition-expression="gsikey=:hash AND gsisort>=:sort"  --expression-attribute-values  '{ ":hash":{"S":"1"},    ":sort":{"S":"1"}, ":levels":{"N":"10"}}'   --filter-expression="levels >= :levels"  --scan-index-forward  --max-items=2  --projection-expression "levels,key1" --index-name=gsikey-gsisort-index

Sample query with limit:

aws dynamodb query --table-name=limitTest --key-condition-expression="gsikey=:hash AND gsisort>=:sort"  --expression-attribute-values  '{ ":hash":{"S":"1"},    ":sort":{"S":"1"}, ":levels":{"N":"10"}}'   --filter-expression="levels >= :levels"  --scan-index-forward  --limit=2  --projection-expression "levels,key1" --index-name=gsikey-gsisort-index
John
  • 1,012
  • 14
  • 22
user273098
  • 151
  • 3
  • 10
3

If there is just one field that is of interest for the pagination you could create an index with that field as a key. Then you do not need to recurse for the number of items in the limit.

obotezat
  • 1,041
  • 16
  • 20
1

Limit defines the number of items/records evaluated using only the KeyCondition(If Present) before applying any filters. To solve this problem, as pointed out in earlier solutions, one approach could be to use a GSI, where the filtering condition is part of your GSI key. However, this is fairly restrictive as it's not practical to introduce a new GSI for every access pattern which requires pagination. A more realistic approach could be to query track the Count in the query response and keep querying & appending the next page of results until the aggregated Count satisfies the client-defined Limit. Keep in mind, that you would need to have some custom logic to build the LastEvaluatedKey, which would be required to fetch the subsequent result-page. In Go, this can be achieved in the following way.

func PaginateWithFilters(ctx context.Context, keyCondition string, filteringCondition int, cursor *Cursor) ([]*Records, error) {
     var collectiveResult []map[string]types.AttributeValue
     var records []*Records
     expr, err := buildFilterQueryExpression(keyCondition, filteringCondition)
     if err != nil {
      return nil, err
     }
     queryInput := &dynamodb.QueryInput{
      ExpressionAttributeNames:  expr.Names(),
      ExpressionAttributeValues: expr.Values(),
      KeyConditionExpression:    expr.KeyCondition(),
      FilterExpression:          expr.Filter(),
      TableName:                 aws.String(yourTableName),
      Limit:                     aws.Int32(cursor.PageLimit),
     }
     if cursor.LastEvaluatedKey != nil {
      queryInput.ExclusiveStartKey = cursor.LastEvaluatedKey
     }
     paginator := dynamodb.NewQueryPaginator(dbClient, queryInput)
     for {
      if !paginator.HasMorePages() {
       fmt.Println("no more records in the partition")
       cursor.LastEvaluatedKey = nil
       break
      }
      singlePage, err := paginator.NextPage(ctx)
      if err != nil {
       return nil, err
      }
      pendingItems := int(cursor.PageLimit) - len(collectiveResult)
      if int(singlePage.Count) >= pendingItems {
       collectiveResult = append(collectiveResult, singlePage.Items[:pendingItems]...)
       cursor.LastEvaluatedKey = buildExclusiveStartKey(singlePage.Items[pendingItems-1])
       break
      }
      collectiveResult = append(collectiveResult, singlePage.Items...)
     }
     err = attributevalue.UnmarshalListOfMaps(collectiveResult, &records)
     if err != nil {
      return nil, err
     }
     return records, nil
    }

This Medium article discusses pagination with DynamoDB in a bit more depth and includes code snippets in Go to paginate query responses with filters.

-1

limts add now in dynamodb

var params = {
        TableName: "message",
        IndexName: "thread_id-timestamp-index",
        KeyConditionExpression: "#mid = :mid",
        ExpressionAttributeNames: {
            "#mid": "thread_id"
        },
        ExpressionAttributeValues: {
            ":mid": payload.thread_id
        },
        Limit:  (3 , 2 ,3),
        LastEvaluatedKey: 1,
        ScanIndexForward: false
    };
    req.dynamo.query(params, function (err, data) {
console.log(err, data);
})
-1

You can query with default Limit size such as 200 with fitlerExpression; then you get all the valid result sets, now comparing to your maxResult param, if result sets size > your maxResult param, you can clear out the rest part. and construct the newEvaluatedKey for next page(nextToken)

if(resultSize > maxResult){
    Map<String,AttributeValue> newEvaluatedKey = 
    convertDataToEvaluateKey(queryResult.getResults().get(maxResult-1));
    List<Model> resultSets = queryResult.getResults();
    resultSets.subList(maxResult, unexpiredGroupSize).clear();
    queryResult.setResults(resultSets);
    queryResult.setCount(maxResult);
    queryResult.setLastEvaluatedKey(newEvaluatedKey);
}
Ping Woo
  • 1,423
  • 15
  • 21