How to query DynamoDB using another partition key

Question

My DynamoDB table looks like this:

And what I want to achieve is:

I want to get all products whose seller is active, which would return only the product with ID #1 (Yeezy Boost 380).

Right now I can only think of this approach:

1. Get all sellers where active is true
2. Get the products for the given seller IDs (from step 1)

But this needs 2 queries.

Is there any better way to handle this? Thanks!

Note:

Right now this "microservice" does just 2 things (access patterns):

- List all products that are registered by an active seller
- Get product detail with the stocks

The way your data is currently set up, your proposal makes sense. However, fetching active sellers and products by seller id will both require an inefficient `scan` operation. A better approach would be to re-design your data model to support your required access patterns. It's difficult to suggest what data model you should go with, since we don't know all of your access patterns. You may get more meaningful responses if you describe some of your access patterns around sellers and products. — Seth Geoghegan, Feb 08 '21 at 22:51
Hi @SethGeoghegan, thanks for the reply. Right now this service does just 2 things: (1) list all products that are registered by an active seller, and (2) get product detail with the stocks. I hope you can help me by suggesting an appropriate data model. Thank you! — JimmyJS, Feb 09 '21 at 05:37
Hi @SethGeoghegan would you help me? because I cannot think of a good design to support my access patterns. — JimmyJS, Feb 11 '21 at 02:03
I see you use "active" to describe a seller. What happens to a seller's products when they are no longer active? In other words, do you have any access patterns around products for inactive sellers? — Seth Geoghegan, Feb 11 '21 at 15:50
@SethGeoghegan right now I don't have any access patterns for the inactive seller. I only need to get all products that are registered by an active seller — JimmyJS, Feb 15 '21 at 04:00

Seth Geoghegan · Answer 1 · 2021-02-16T16:34:45.867

There are many ways to define data models in DynamoDB. The right data model for your application requires you to strike a delicate balance across all your access patterns. I'll illustrate one way to model what you're describing, but it is not the only way. I hope this helps get you un-stuck!

Take a look at the following data model:

In this example, I'm modeling the one-to-many relationship between sellers and products by establishing a seller partition. Each product exists within the seller partition.

I've chosen to model active/inactive sellers with a sparse global secondary index (GSI): only the products of active sellers carry the index's key attributes. In the above example, notice that the products in Joe Smith's partition (SELLER#1) have GSIPK and GSISK attributes defined, which means Joe Smith's products are "active" and will show up in the GSI. The GSI would look like this:

If you want to search for products from active sellers, you would query the global secondary index. Notice that Jane Smith (SELLER#2) does not have any products in the GSI. That's because none of Jane Smith's products have GSIPK and GSISK attributes defined. Therefore, if you want to mark a seller (or the seller's products) as active/inactive, you'd simply add/remove the GSIPK/GSISK attributes as needed.
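
To make the sparse-index idea concrete, here is a minimal in-memory sketch in Python. It does not talk to DynamoDB; it only mimics the rule that an item appears in a GSI when it carries the index's key attributes. The attribute names PK, SK, GSIPK and GSISK follow the naming used above, but the key values (for example `PRODUCTS` and `PRODUCT#1`) and the second product are assumptions made up for illustration.

```python
from typing import Dict, List

Item = Dict[str, str]

# Hypothetical items: the key values are illustrative, not copied from the real table.
TABLE: List[Item] = [
    # Product of an active seller: it carries the GSI key attributes.
    {"PK": "SELLER#1", "SK": "PRODUCT#1", "name": "Yeezy Boost 380",
     "GSIPK": "PRODUCTS", "GSISK": "PRODUCT#1"},
    # Product of an inactive seller: no GSI key attributes, so it is not indexed.
    {"PK": "SELLER#2", "SK": "PRODUCT#2", "name": "Placeholder product"},
]


def project_sparse_gsi(items: List[Item]) -> List[Item]:
    """Keep only items that have both index keys, ordered by (GSIPK, GSISK)."""
    indexed = [i for i in items if "GSIPK" in i and "GSISK" in i]
    return sorted(indexed, key=lambda i: (i["GSIPK"], i["GSISK"]))


def query_gsi(items: List[Item], gsipk: str) -> List[Item]:
    """Mimic a Query on the index: match on the index partition key only."""
    return [i for i in project_sparse_gsi(items) if i["GSIPK"] == gsipk]


active_products = query_gsi(TABLE, "PRODUCTS")
```

With the sample items, `active_products` contains only the Yeezy Boost 380 item, because the other product never enters the index.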

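In this example, marking a seller as inactive would require updating each of the seller's products (removing the GSIPK/GSISK attributes). You'd do this with one write per product: DynamoDB's BatchWriteItem only supports puts and deletes, so removing the attributes means either an UpdateItem call per product or re-putting each product item without them. If this introduces a performance bottleneck, you may want a different approach.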
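
As a rough sketch of what deactivating a seller could involve, the helper below builds one UpdateItem request per product that removes the two index attributes. It only constructs the parameter dictionaries and makes no AWS calls. The table name, the `PK`/`SK` key names, and the `SELLER#`/`PRODUCT#` key formats are assumptions for illustration.

```python
from typing import Any, Dict, List


def build_deactivate_updates(
    table_name: str, seller_id: str, product_ids: List[str]
) -> List[Dict[str, Any]]:
    """Build UpdateItem parameters that drop GSIPK/GSISK from each product item."""
    return [
        {
            "TableName": table_name,
            # Assumed key layout: seller partition, product sort key.
            "Key": {
                "PK": {"S": f"SELLER#{seller_id}"},
                "SK": {"S": f"PRODUCT#{pid}"},
            },
            # REMOVE deletes the attributes, which takes the item out of the GSI.
            "UpdateExpression": "REMOVE GSIPK, GSISK",
        }
        for pid in product_ids
    ]


# "Marketplace" is a hypothetical table name.
updates = build_deactivate_updates("Marketplace", "1", ["1", "2"])
# Each dict could then be passed to a boto3 DynamoDB client, e.g.
#   client.update_item(**params)
```

The number of writes grows with the number of products per seller, which is the cost to weigh against the cheap read from the index.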

You might consider using a time-to-live (TTL) attribute on inactive sellers/products and let DynamoDB handle removing the expired products from your product search. Since TTLs don't remove items immediately, you'd still need to filter for expired products when searching.
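
The read-side filtering mentioned here could look something like the following sketch. It assumes the TTL attribute is called `expiresAt` (a made-up name) and holds an epoch timestamp in seconds, which is the format DynamoDB's TTL feature expects. Items without the attribute are treated as never expiring.

```python
import time
from typing import Any, Dict, List, Optional


def drop_expired(
    items: List[Dict[str, Any]],
    ttl_attr: str = "expiresAt",  # hypothetical attribute name
    now: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Return items whose TTL is absent or still in the future."""
    current = int(time.time()) if now is None else now
    return [i for i in items if ttl_attr not in i or i[ttl_attr] > current]


# Example with a fixed clock so the result does not depend on the current time.
sample = [
    {"name": "expired product", "expiresAt": 500},
    {"name": "still valid product", "expiresAt": 2000},
    {"name": "product without TTL"},
]
visible = drop_expired(sample, now=1000)
```

The same condition could instead be pushed into a filter expression on the query, at the cost of still reading the expired items.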

As you can see, there are many ways to handle data modeling in DynamoDB. The right fit for your use case depends on a deep understanding of your access patterns. I hope these approaches give you some ideas of the possibilities.

Hi, thanks for the answer! I have some questions, though: how can I get the product detail with the stocks (my 2nd access pattern), and how is the performance if I have 1 million items and I want to change a seller from active to inactive? I would need to update the products one by one or use a batch update. Do you have any experience with batch updates on large data sets? Thanks! — JimmyJS, Feb 16 '21 at 05:39
For your first question, you'd add the product detail to the product item. I only showed a few item attributes for illustration purposes, but you could certainly add as many attributes as you need. I've updated my answer to address your second question. It can be hard to solve DDB data modeling problems on StackOverflow. The right patterns will require a deep understanding of your access patterns and will require multiple iterations to get right. Storing 1 million products may require a different approach than storing 1000 products (read/write sharding, for example). — Seth Geoghegan, Feb 16 '21 at 16:31

score 0 · Answer 2 · answered Feb 11 '21 at 12:10

I am not sure about the scale of your data, but what if you store the list of product IDs sold by a seller in the seller detail record itself?

Like:

seller-1, detail, TRUE, Jimmy, <product-1>  
seller-2, detail, FALSE, Ray, <product-1, product-2>

So, you can have a GSI on the active column, and in 1 query you can get all active sellers and their product IDs. (Index key attributes must be strings, numbers, or binary, so the flag would need to be stored as, say, the string "TRUE" rather than a boolean.) Of course, you still need to fetch the details of those products with other queries, but that is the case currently as well.

Also, in this model, whenever you add or change a product for a seller, you need to update the seller and product records together in a transaction.
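
As one possible shape for that transaction, the sketch below builds the parameters for a TransactWriteItems request that puts the product record and adds its ID to the seller's product set in a single all-or-nothing write. It only builds the dictionary and makes no AWS calls. The key names `pk`/`sk`, the `productIds` string-set attribute, and the table name are assumptions, not something given above.

```python
from typing import Any, Dict


def build_add_product_transaction(
    table_name: str, seller_id: str, product_id: str, product_name: str
) -> Dict[str, Any]:
    """Build TransactWriteItems parameters touching product and seller together."""
    return {
        "TransactItems": [
            {
                # Write the product record itself.
                "Put": {
                    "TableName": table_name,
                    "Item": {
                        "pk": {"S": product_id},
                        "sk": {"S": "detail"},
                        "name": {"S": product_name},
                        "sellerId": {"S": seller_id},
                    },
                }
            },
            {
                # Add the product ID to the seller's set of product IDs.
                "Update": {
                    "TableName": table_name,
                    "Key": {"pk": {"S": seller_id}, "sk": {"S": "detail"}},
                    "UpdateExpression": "ADD productIds :p",
                    "ExpressionAttributeValues": {":p": {"SS": [product_id]}},
                }
            },
        ]
    }


# Hypothetical table name and IDs, following the seller-1/product-1 style above.
params = build_add_product_transaction("Marketplace", "seller-1", "product-3", "New product")
# The dict could then be passed to a boto3 DynamoDB client, e.g.
#   client.transact_write_items(**params)
```

Either both writes succeed or neither does, which keeps the seller's product list from drifting out of sync with the product records.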
