0

This seems like such an elementary part of databases, cannot believe Dynamo does not do this.

Supposing I have a Case. I have 2 dates: when the Case became active, and when it became inactive. I want to write a query that would return the count of active cases for a given Date.

In SQL (and MySQL has special Date indices), I could do an expression 'where :date between active and inactive.' Cannot do this in dynamo for a bunch of reasons:

  • there is no date type
  • there only seem to be concatenated keys since everything is a hash hence no between

So far the only things I have come up with were:

  • Sharding - should probably shard this table, I did some reading on that and the way Dynamo does sharding seems simple, although kinda sucks that you end up with 2 tables
    • if I do this, then I can just ask for the active count each day and store it
    • which means if I wanted count for a day in the past, I would have to table scan, and worse, scan 2 tables (as I understand it)
  • Date Partitions - the problem here is which date do we partition on, I guess activation, then the presumption is a count for a given Date would have a key expression of active <= :date, and a filter expression of inactive is null
  • Distinct Events - if I am recording Events on each case, the count of active cases on a given date is also the distinct set of CaseIDs in the Events table for that date, but that looks like it's not easy to do

Still reading so would not be surprised if I am missing something obvious. Actually one other possible way to do this is move the event data to Timestream and then have it compute this aggregate.

Rob
  • 11,446
  • 7
  • 39
  • 57
  • DynamoDB does not support aggregations (e.g. select count(*) where...). However, there are strategies to overcome that. DynamoDB data modeling is all about understanding the nuance of the access patterns. For example, your access pattern appears to be "fetch count of all active cases between two ad-hoc dates", which is different than "fetch daily/weekly/monthly counts of active cases". The details matter when modeling the data. What is the ratio of active to inactive cases? How many cases will you be searching? How wide a date range? – Seth Geoghegan Jun 24 '21 at 23:34
  • Yeah I should have mentioned that, figured that out. Should also have mentioned that I got this working with filters, goal here was how to do it with indices. Thanks @SethGeoghegan – Rob Jun 27 '21 at 03:27

0 Answers0