Consider a simple blog post schema with the following columns:

ID 
Author 
Category 
Status 
CreatedDateTime
UpdatedDateTime

Assume the following queries:

  • query by ID
  • query by Author, paginated
  • query by (Author, Status), sorted by CreatedDateTime, paginated
  • query by (Category, Status), sorted by CreatedDateTime, paginated

So it seems that, without much extra work, the code would be easier to implement with SimpleDB?

Howard
  • did you check https://stackoverflow.com/questions/55340000/how-to-model-a-forum-using-amazon-dynamodb-without-hot-partitions – best wishes May 21 '19 at 05:45

2 Answers

SimpleDB is barely supported by AWS any more; you can't even find it in the AWS console. So while it may work for you, personally I would be deciding between DynamoDB and DocumentDB (assuming you want NoSQL). I don't think there is any reason to start a new project on such an old offering at this point.

E.J. Brennan
  • It's cheap cheap cheap. That's why Amazon stopped promoting it. And it's great for small projects with low performance requirements. – jbrown Dec 13 '19 at 05:05

You should use DynamoDB because it has a lot of useful features such as Point in Time Recovery, transactions, encryption-at-rest, and activity streams that SimpleDB does not have.

If you're operating on a small scale, DynamoDB has the advantage that it allows you to set a maximum capacity for your table, which means you can make sure you stay in the free tier.

If you're operating at a larger scale, DynamoDB automatically handles all of the partitioning of your data (and has, for all practical purposes, limitless capacity), whereas SimpleDB has a limit of 10 GB per domain (aka "table") and you are required to manage any horizontal partitioning across domains that you might need.

Finally, there are signs that SimpleDB is already on a deprecation path. For example, if you look at the SimpleDB release notes, you will see that the last update was in 2011, whereas DynamoDB had several new features announced at the last re:Invent conference. Also, there are a number of reddit posts (such as here, here, and here) where the general consensus is that SimpleDB is already deprecated, and in some of the threads, Jeff Barr even commented and did not contradict any of the assertions that SimpleDB is deprecated.


That being said, in DynamoDB, you can support your desired queries. You will need two Global Secondary Indexes, which use a composite sort key. Your queries can be supported with the following schema:

  • ID — hash key of your table
  • Author — hash key of the Author-Status-CreatedDateTime-index
  • Category — hash key of the Category-Status-CreatedDateTime-index
  • Status
  • CreatedDateTime
  • UpdatedDateTime
  • Status-CreatedDateTime — sort key of Author-Status-CreatedDateTime-index and Category-Status-CreatedDateTime-index. This is a composite attribute that exists to enable some of your queries. It is simply the value of Status with a separator character (I'll assume it's # for the rest of this answer), and CreatedDateTime appended to the end. (Personal opinion here: use ISO-8601 timestamps instead of unix timestamps. It will make troubleshooting a lot easier.)
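As a rough sketch of the schema above, the composite attribute and the two indexes could be expressed as follows. The table name `BlogPosts`, the on-demand billing mode, the `ALL` projection, and the helper names `status_created_key` and `gsi` are assumptions made for illustration; the dictionary is shaped like the keyword arguments that boto3's DynamoDB client `create_table` call accepts, but nothing here talks to AWS.

```python
from datetime import datetime, timezone

SEPARATOR = "#"  # separator character assumed in the answer
SORT_KEY = "Status-CreatedDateTime"  # composite attribute name from the schema


def status_created_key(status: str, created: datetime) -> str:
    """Join Status and an ISO-8601 UTC timestamp into the composite sort key.

    `created` must be timezone-aware so the conversion to UTC is unambiguous.
    """
    stamp = created.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{status}{SEPARATOR}{stamp}"


def gsi(hash_attribute: str) -> dict:
    """Describe one GSI: the given hash key plus the composite sort key."""
    return {
        "IndexName": f"{hash_attribute}-{SORT_KEY}-index",
        "KeySchema": [
            {"AttributeName": hash_attribute, "KeyType": "HASH"},
            {"AttributeName": SORT_KEY, "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},  # assumption: project everything
    }


# Only attributes used as table or index keys are declared up front.
TABLE_DEFINITION = {
    "TableName": "BlogPosts",  # hypothetical table name
    "BillingMode": "PAY_PER_REQUEST",  # assumption; provisioned capacity also works
    "AttributeDefinitions": [
        {"AttributeName": name, "AttributeType": "S"}
        for name in ("ID", "Author", "Category", SORT_KEY)
    ],
    "KeySchema": [{"AttributeName": "ID", "KeyType": "HASH"}],
    "GlobalSecondaryIndexes": [gsi("Author"), gsi("Category")],
}
```

The composite value has to be written by the application on every insert and whenever Status changes, since DynamoDB does not derive it. Fixed-width ISO-8601 timestamps in UTC also sort lexicographically in chronological order, which is what the sort key relies on.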

Using this schema, you can satisfy all of your queries.

query by ID: Simply perform a GetItem request on the main table using the blog post ID.

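query by Author, paginated: Perform a Query on the Author-Status-CreatedDateTime-index with a key condition expression of Author = :author. Note that the results come back ordered by the index sort key, so they are grouped by Status first and sorted by CreatedDateTime within each status.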

query by (Author, Status), sorted by CreatedDateTime, paginated: Perform a Query on the Author-Status-CreatedDateTime-index with a key condition expression of Author = :author and begins_with(Status-CreatedDateTime, :status). The results will be returned in order of ascending CreatedDateTime.

query by (Category, Status), sorted by CreatedDateTime, paginated: Perform a Query on the Category-Status-CreatedDateTime-index with a key condition expression of Category = :category and begins_with(Status-CreatedDateTime, :status). The results will be returned in order of ascending CreatedDateTime. (Additionally, if you wanted to get all the blog posts in the "technology" category that have the status published and were created in 2019, you could use a key condition expression of Category = "technology" and begins_with(Status-CreatedDateTime, "published#2019").)

The sort order of the results can be controlled using the ScanIndexForward field of the Query request. The default is true (ascending); setting it to false makes DynamoDB return results in descending order.
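To make the key conditions concrete, here is a minimal sketch of a helper that builds the parameters for these queries in the shape of the low-level Query request. The function name `build_query`, the table name `BlogPosts`, and the default page size are hypothetical. One detail the prose glosses over: an attribute name containing a hyphen, such as Status-CreatedDateTime, cannot be written directly in an expression, so the sketch goes through ExpressionAttributeNames placeholders.

```python
from typing import Optional

SORT_KEY = "Status-CreatedDateTime"
SEPARATOR = "#"  # separator assumed in the answer


def build_query(
    index_name: str,
    hash_attribute: str,
    hash_value: str,
    status: Optional[str] = None,
    created_prefix: str = "",
    newest_first: bool = False,
    page_size: int = 20,  # hypothetical default
) -> dict:
    """Build Query parameters for one of the two indexes.

    With no status, only the hash key is constrained. With a status, the
    composite sort key must start with "<status>#<created_prefix>".
    """
    names = {"#pk": hash_attribute}
    values = {":pk": {"S": hash_value}}
    condition = "#pk = :pk"
    if status is not None:
        names["#sk"] = SORT_KEY
        # Including the separator keeps "draft" from matching "drafted#...".
        values[":prefix"] = {"S": f"{status}{SEPARATOR}{created_prefix}"}
        condition += " AND begins_with(#sk, :prefix)"
    return {
        "TableName": "BlogPosts",  # hypothetical table name
        "IndexName": index_name,
        "KeyConditionExpression": condition,
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
        "ScanIndexForward": not newest_first,  # False returns descending order
        "Limit": page_size,
    }


# Published technology posts from 2019, newest first.
params = build_query(
    "Category-Status-CreatedDateTime-index",
    "Category",
    "technology",
    status="published",
    created_prefix="2019",
    newest_first=True,
)
```

The resulting dictionary could then be passed as keyword arguments to a DynamoDB client's `query` call.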

DynamoDB has built-in support for paginating the results of a Query operation. Basically, any time there are more results that were not returned, the query response will contain a LastEvaluatedKey, which you can pass as the ExclusiveStartKey of your next query request to pick up where you left off. (See Query Pagination for more details about how it works.)
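The pagination loop can be sketched like this. `iter_pages` is a hypothetical helper that accepts any callable with the shape of a Query call; `fake_query` is a stand-in used only so the example runs without AWS, and its offset-based key is not what a real LastEvaluatedKey looks like (the real one is a map of key attributes that should be passed back unchanged).

```python
from typing import Any, Callable, Dict, Iterator, List


def iter_pages(query: Callable[..., Dict[str, Any]], **params: Any) -> Iterator[List[Any]]:
    """Yield one page of items per request until no LastEvaluatedKey is returned."""
    while True:
        response = query(**params)
        yield response.get("Items", [])
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            return
        # Resume the next request where the previous one stopped.
        params["ExclusiveStartKey"] = last_key


def fake_query(**params: Any) -> Dict[str, Any]:
    """Stand-in for a real Query call: serves three items, Limit at a time."""
    data = ["post-1", "post-2", "post-3"]
    start = params.get("ExclusiveStartKey", {}).get("offset", 0)
    end = start + params.get("Limit", 2)
    response: Dict[str, Any] = {"Items": data[start:end]}
    if end < len(data):
        response["LastEvaluatedKey"] = {"offset": end}
    return response


pages = list(iter_pages(fake_query, Limit=2))
```

For a user-facing "next page" link, the same idea applies one request at a time: return the LastEvaluatedKey to the caller as an opaque cursor and feed it back in on the following request.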


On the other hand, if you're already familiar with SQL, and you want to make this as easy for yourself as possible, consider just using the Aurora Serverless Data API.

Matthew Pope
  • This is a good answer. Since the OP hasn't created the table yet, it might be wise to consider [local secondary indexes](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/LSI.html) over global. Also, the `Author-Status-CreatedDate` index might not produce the desired order for the query by just `Author`. Depends on what the OP wants. For that reason, consider setting the range key of the table to just `Author`. In summary, hash key of `ID`, range key of `Author`, and the two secondary indexes that Matthew laid out. – bigh_29 May 31 '19 at 13:47
  • Range key of Author as @bigh_29 has described will not allow you to query by author. And best practice is to always prefer GSIs over LSIs. – Matthew Pope May 31 '19 at 14:38
  • Gah, you are correct on the range key if `ID` is the partition key. It would only work if all the articles for an author were hashed into the same partition. This kills the idea of using LSIs as well. A different hash key would be needed for either of those ideas to work. – bigh_29 May 31 '19 at 15:02