1

I have two entities in my dynamo table: User and Order.

Each user has 0..* orders and each order has exactly one associated user. Every order also has a orderDate attribute, that describes when the order was placed. My current table is structured as follows to make retrieving all orders for a specific user efficient:

+--------------+----------------+--------------------------------------+
| PK           | SK             |              Attributes              |
+--------------+----------------+-------------+-----------+------------+
|              |                | name        | firstName | birthDate  |
+--------------+----------------+-------------+-----------+------------+
| USER#userid1 | META#userid1   | Foo         | Bar       | 2000-10-10 |
+--------------+----------------+-------------+-----------+------------+
|              |                | orderDate   |           |            |
+--------------+----------------+-------------+-----------+------------+
| USER#userid1 | ORDER#orderid1 | 2020-05-10  |           |            |
+--------------+----------------+-------------+-----------+------------+

I now have a second access pattern where I want to query all orders (regardless of user) that were placed on a specific day (e.g. 2020-05-10) along with the the user(s) that placed them.

I'm struggling to handle this access pattern in my table design. Neither GSIs nor different primary keys seem to work here, because I either have to duplicate every user item for each day or I can't query the orders together with the user.

Is there an elegant solution to my problem?

mlxyz
  • 62
  • 9
  • Two options: either duplicate the user info and store in the order record or use a second getItem to query the user-specific info. – jellycsc Aug 09 '20 at 14:37
  • @jellycsc Ok, thats what I thought - suboptimal solution either way. If you post it again as an answer, I'll accept it. – mlxyz Aug 11 '20 at 06:46
  • Denormalizing the data (duplicating user info) is part of the solution, but still doesnt address searching by order date. I've supplied an answer below with a solution that wont require multiple queries – Seth Geoghegan Aug 12 '20 at 23:39

2 Answers2

1

Unfortunately, I can't seem to figure out a way to elegantly solve your problem.

You need to either duplicate the user info and store in the order record or use a second getItem to query the user-specific info.

If anyone has better solutions, please let me know.

jellycsc
  • 10,904
  • 2
  • 15
  • 32
1

This is a perfect use case for a secondary index. Here's one way to do it:

You could create a secondary index (GSI1) on the Order item with a Partition Key (GSI1PK) of ORDERS#<orderDate> and a Sort Key (GSI1SK) of USER#<user_id>. It would look something like this:

User Orders

The logical view of your GSI1 would look like this:

enter image description here

GSI1 would now support a query of all orders placed on a specific day.

Keep in mind that denormalizing your data model (e.g. repeating user info in the Order item) is a common pattern utilized in DynamoDB data modeling. Remember, space is cheap! More importantly, you are pre-joining your data to support your applications access patterns. In this instance, I'd add whatever User metadata you need to the Order item so it gets projected into the index.

Make sense?

Seth Geoghegan
  • 5,372
  • 2
  • 8
  • 23
  • 1
    Yes, that is the most efficient way to model the "get all orders by date" part of the problem. What I could indeed do is add a `user` attribute to each order and then fill that with some basic information. That would make updates to a users' attributes extremely inefficient though as I would have to do a scan of all orders in the table. – mlxyz Aug 17 '20 at 13:24
  • In this example, if I want to store the payment details to order then what will be the PK, SK, GSI_2_PK and GSI_2_SK? PK = USER#userId1, SK=Order#orderId1, GSI_2_PK=payment_id_1, GSI_2_SK= 'provider_1' . Please tell me the logic is correct or not. – arun n a Oct 23 '21 at 10:10