
In the context of a tracker system, users' devices deliver location data to the backend, and the system subsequently queries that data both per user and in bulk. The structure of the data is as follows:

{"user_id": "user_1", "timestamp": "2020-10-31 07:05:10.153777+00:00", "location": "XYZ", "details": "PQR"}

The queries that we need are:

Get all location and details data for X<timestamp<Y

and

Get all location and details data for user_id=P and X<timestamp<Y

The total size of the database would be around 10 TB. I am a DynamoDB newbie and am not sure I understand the concept of a partition key very well. Currently I would plan to use a table with partitionKey as user_id and rangekey as timestamp, and then create a global secondary index on a "day" attribute derived from the timestamp to satisfy the first query.

  • Does anybody have advice on how the DynamoDB table should be structured for best scaling and performance?
  • Does anybody have any advice/criticism about the currently suggested structure?
user367231

1 Answer


I would plan to use a table with partitionKey as user_id and rangekey as timestamp

I think that's a good structure for satisfying your second query. You could specify the user as the partition key value, then restrict the range key to the desired date/time range.
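As a rough illustration of that second query, here is a minimal sketch that only builds the request parameters in the shape the low-level DynamoDB Query API takes. The table name `locations` and the helper name `user_range_query` are made up for the example, and it assumes timestamps are stored as strings in one consistent format and UTC offset, so that string order matches time order.

```python
def user_range_query(user_id, start_ts, end_ts, table_name="locations"):
    """Build Query parameters for one user's items in a timestamp range.

    'locations' is a hypothetical table name. Placeholders (#uid, #ts) are
    used for the attribute names so reserved words cannot clash.
    """
    return {
        "TableName": table_name,
        # Partition key must use '='; the range key gets a single condition.
        "KeyConditionExpression": "#uid = :uid AND #ts BETWEEN :start AND :end",
        "ExpressionAttributeNames": {"#uid": "user_id", "#ts": "timestamp"},
        "ExpressionAttributeValues": {
            ":uid": {"S": user_id},
            ":start": {"S": start_ts},
            ":end": {"S": end_ts},
        },
    }


params = user_range_query(
    "user_1",
    "2020-10-31 00:00:00+00:00",
    "2020-10-31 23:59:59.999999+00:00",
)
```

Note that BETWEEN is inclusive on both ends, so a strict X<timestamp<Y would need the endpoints handled separately. With boto3, a dict like this could be passed as `client.query(**params)`.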

For your first query, requesting X<timestamp<Y on its own might give you trouble. The DynamoDB documentation on constructing a Key Condition Expression says:

You must specify the partition key name and value as an equality condition.

In other words, even if you build a GSI on the "day" portion of the timestamp, I'm not aware of a way to do an X<timestamp<Y query across partitions directly: a single partition key value must be given.

Based on what you've said, you could still use a GSI indexed on the "day" portion of your timestamp and query it sequentially, a day at a time.
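The day-at-a-time approach could look roughly like the sketch below, which builds one Query request per calendar day in the range. The table name `locations`, the index name `day-index`, and both helper names are invented for the example; it also assumes the `day` attribute is stored as `YYYY-MM-DD` and that all timestamps use the same UTC offset.

```python
from datetime import datetime, timedelta


def days_between(start_ts, end_ts):
    """Return every calendar day (YYYY-MM-DD) touched by the range."""
    day = datetime.fromisoformat(start_ts).date()
    last = datetime.fromisoformat(end_ts).date()
    days = []
    while day <= last:
        days.append(day.isoformat())
        day += timedelta(days=1)
    return days


def day_queries(start_ts, end_ts, table_name="locations", index_name="day-index"):
    """Build one Query request per day; each uses '=' on the GSI partition key.

    'locations' and 'day-index' are hypothetical names.
    """
    return [
        {
            "TableName": table_name,
            "IndexName": index_name,
            "KeyConditionExpression": "#d = :day AND #ts BETWEEN :start AND :end",
            "ExpressionAttributeNames": {"#d": "day", "#ts": "timestamp"},
            "ExpressionAttributeValues": {
                ":day": {"S": day},
                ":start": {"S": start_ts},
                ":end": {"S": end_ts},
            },
        }
        for day in days_between(start_ts, end_ts)
    ]


queries = day_queries("2020-10-30 22:00:00+00:00", "2020-11-01 03:00:00+00:00")
```

Each request would then be run in turn (or in parallel) and the results concatenated. A single Query call returns at most 1 MB of data, so each day's request would also need to be repeated with `ExclusiveStartKey` while the response contains a `LastEvaluatedKey`.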

This is sort of the idea behind write sharding, where you explicitly control the number of partitions in your GSI to allow for direct querying. In your case, creating a GSI keyed on the "day" would give you one partition key value per day, which can be queried directly using the = operator, as DynamoDB requires.
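To make the layout concrete, here is a minimal sketch of how the `day` attribute might be derived at write time and how the table plus GSI could be described. The table name `locations`, the index name `day-index`, the helper names, and the on-demand billing mode are all assumptions for illustration, not something the question specifies.

```python
from datetime import datetime


def with_day(item):
    """Return a copy of the item with a 'day' attribute (YYYY-MM-DD)
    derived from its timestamp, for use as the GSI partition key."""
    day = datetime.fromisoformat(item["timestamp"]).date().isoformat()
    return {**item, "day": day}


def table_definition(table_name="locations", index_name="day-index"):
    """Parameters in the shape of the CreateTable API (hypothetical names)."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": "user_id", "AttributeType": "S"},
            {"AttributeName": "timestamp", "AttributeType": "S"},
            {"AttributeName": "day", "AttributeType": "S"},
        ],
        # Base table: per-user queries over a timestamp range.
        "KeySchema": [
            {"AttributeName": "user_id", "KeyType": "HASH"},
            {"AttributeName": "timestamp", "KeyType": "RANGE"},
        ],
        # GSI: all users' items for one day, ordered by timestamp.
        "GlobalSecondaryIndexes": [
            {
                "IndexName": index_name,
                "KeySchema": [
                    {"AttributeName": "day", "KeyType": "HASH"},
                    {"AttributeName": "timestamp", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
        "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
    }


sample = {
    "user_id": "user_1",
    "timestamp": "2020-10-31 07:05:10.153777+00:00",
    "location": "XYZ",
    "details": "PQR",
}
item = with_day(sample)
definition = table_definition()
```

With boto3, the definition could be passed as `client.create_table(**definition)`. One thing to weigh with this layout is that every write on a given day lands on the same GSI partition key value.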

Jsphar