0

I'm trying to use DynamoDB for a project and I need some clarifications about the table schema design and indexes definition. Let me to provide you an example of the data that I need to store.

Suppose that I created a table with this schema:

  • ID (Unique key)
  • User-ID
  • Brand
  • Car-Name
  • Car-Reference
  • ... additional attributes

As you can see, I'm using a Primary-key (ID) that it's an unique-id so I don't need to have a sort-key also. Based on the queries that I need to do, I created some GSI (Global-Secondary-Index) indexes also:

  • I need to query the table by User-ID and Brand. So I created the Index_1 using User-ID as HASH and Brand as RANGE

  • I need to query the table by Car-Reference. So I created the Index_2 using Car-Reference as HASH

This from the application point of view seems works fine. But I have some questions here:

  • I created GSI indexes but I'm asking myself.. When is the LSI (Local Secondary Index) necessary? From what I understand, it's not possible to create LSI if you are not using a Sort-Key. Is this correct? I understand also that no additional costs are required using LSI, for GSI there are some additional costs instead.
  • So, at the moment, from what I described with this example, do you see something wrong or something that could be improved?

Now, suppose that I need to store additional information linked to the User-ID. Just to reuse the same example, we can suppose that we need to store the amount of the customer-service-requests that the user opened for each brand (I know, stupid example ... but let me assume it.. so we need to store a counter and the last request-date linked to the user-id). I have some doubts here:

  • In order to reduce the costs, maybe it could be better to avoid creating a dedicated table.. so I'm thinking of storing this info into the same table also. Is it a bad design?

  • Supposing that this could be ok, now I have to store raws like this:

    • ID
    • User_ID
    • Brand
    • Requests_Count
    • Last request date

So, I need these possible queries:

  • By User-ID and Brand
  • Maybe some query like "Last request date" < P-date. But I'm not sure about it, so we can skip it for the moment.

My doubts here is:

On this table I already have a GSI for search by User-ID and Brand. I need a way to distinguish if the result rows are related to User-Counter or Car-row-info right? So.. I basically need one more info in order to search by:

User-ID - Brand - InfoType (or something like this)

How can I manage it? I'm a bit confused.

Safari
  • 11,437
  • 24
  • 91
  • 191

1 Answers1

0

LSI

I created GSI indexes but I'm asking myself.. When is the LSI (Local Secondary Index) necessary? From what I understand, it's not possible to create LSI if you are not using a Sort-Key. Is this correct? I understand also that no additional costs are required using LSI, for GSI there are some additional costs instead.

Additional costs are imposed on an LSI, you just share throughput with the base table, you still pay for additional storage also. You cannot change the partition key on an LSI, so it does not fit your use-case. LSI's are used when you need to sort the data by a different attribute, and also maintain strong consistency.

Data Model

So, at the moment, from what I described with this example, do you see something wrong or something that could be improved?

You use a unique ID for the base table, but you failed to state if you require to obtain the data using that ID? Its always best to model your base table for the most prominent access patterns, using indexes only to supplement access patterns which cannot be obtained using the base table.

Would the following data model work:

pk sk data
USERID#BRAND TIMESTAMP DATA
00111#Volkswagen 2023-01-01T00:00:000 some data
00111#Toyota 2023-02-01T00:00:000 some data
Leeroy Hannigan
  • 11,409
  • 3
  • 14
  • 31
  • Yes, I mentioned only the queries that I need on the indexes, but I need also to query by ID. – Safari Jul 21 '23 at 14:52
  • Using the approach that you suggested, this means that I'll use a sort-key also..so in this case I can convert the GSI indexes to LSI right? – Safari Jul 21 '23 at 15:40
  • Why do you want an LSI? – Leeroy Hannigan Jul 21 '23 at 16:39
  • I don’t want LSI, I would like just to understand if my way is good. supposing that I’ll choose what you are suggesting, in this way what are the pro and cons usign the GSI or LSI instead? In my case, Is the GSI better then the LSI? – Safari Jul 21 '23 at 22:07
  • GSI is always better than an LSI, unless you simply want to order the data in a different way, but with the same partition key and have strongly consistent reads. An LSI does not fit your requirements. – Leeroy Hannigan Jul 21 '23 at 23:49
  • BTW, as I wrote, I need to use an unique pk to identify the car. sussposing that I’ll use the same table to store the user count info, I understand that the schema that you are suggesting it’s not valid. How I can adjust it to statisfy the requests? – Safari Jul 24 '23 at 13:19
  • I created one more question in order to explain better my specific doubts with one specific example also – Safari Jul 26 '23 at 15:17