0

I know the query below is not supported in DynamoDB since you must use an equality expression on the HASH key. query({ TableName, IndexName, KeyConditionExpression: 'purchases >= :p', ExpressionAttributeValues: { ':p': 6 } }); How can I organize my data so I can efficiently make a query for all items purchased >= 6 times?

Right now I only have 3 columns, orderID (Primary Key), address, confirmations (GSI).

Would it be better to use a different type of database for this type of query?

Daniel Kobe
  • 9,376
  • 15
  • 62
  • 109

2 Answers2

0

You would probably want to use the DynamoDB streams feature to perform aggregation into another DynamoDB table. The streams feature will publish events for each change to your data, which you can then process with a Lambda function.

I'm assuming in your primary table you would be tracking each purchase, incrementing a counter. A simple example of the logic might be on each update, you check the purchases count for the item, and if it is >= 6, add the item ID to a list attribute itemIDs or similar in another DynamoDB table. Depending on how you want to query this statistic, you might create a new entry every day, hour, etc.

Bear in mind DynamoDB has a 400KB limit per attribute, so this may not be the best solution depending on how many items you would need to capture in the itemIDs attribute for a given time period.

You would also need to consider how you reset your purchases counter (this might be a scheduled batch job where you reset purchase count back to zero every x time period).

Alternatively you could capture the time period in your primary table and create a GSI that is partitioned based upon time period and has purchases as the sort key. This way you could efficiently query (rather than scan) based upon a given time period for all items that have purchase count of >= 6.

mixja
  • 6,977
  • 3
  • 32
  • 34
-1

You dont need to reorganise your data, just use a scan instead of a query

scan({
    TableName,
    IndexName,
    FilterExpression: 'purchases >= :p',
    ExpressionAttributeValues: { ':p': 6 }
});
F_SO_K
  • 13,640
  • 5
  • 54
  • 83
  • Why do you say that? You want to search over every item in your table. That is exactly what a scan is for. – F_SO_K Mar 16 '18 at 16:42
  • I see your point. I was hoping there would be a way to partition the data based on confirmations and then make a more efficient query than O(n). Would this be possible? – Daniel Kobe Mar 16 '18 at 21:59
  • Would the operation be more efficient in a different kind of database? I'm asking because this will be a frequent query and doing a scan is costly. – Daniel Kobe Mar 16 '18 at 22:00