0

I have been looking at DynamoDB to create something close to a transaction. I was watching this video presentation: https://www.youtube.com/watch?v=KmHGrONoif4 in which the speaker shows around the 30 minute mark ways to make dynamodb operation close to ACID compliant as can be. He shows the best concept is to use dynamodb streams, but doesn't show a demo or an example. I have a very simple scenario I am look at and that is I have one Table called USERS. Each user has a list of friends. If two users no longer wish to be friends they must be removed from both of the user's entities (I can't afford for one friend to be deleted from one entity, and due to a crash for example, the second user entities friend attribute is not updated causing inconsistent data). I was wondering if someone could provide some simple walk-through oh of how to accomplish something like this to see how it all works? If code could be provided that would be great to see how it works.

Cheers!

user2924127
  • 6,034
  • 16
  • 78
  • 136

2 Answers2

0

Here is the transaction library that he is referring: https://github.com/awslabs/dynamodb-transactions You can read through the design: https://github.com/awslabs/dynamodb-transactions/blob/master/DESIGN.md

Here is the Kinesis client library: http://docs.aws.amazon.com/kinesis/latest/dev/developing-consumers-with-kcl.html

When you're writing to DynamoDB, you can get an output stream with all the operations that happen on the table. That stream can be consumed and processed by the Kinesis Client Library.

In your case, have your client remove it from the first user, then from the second user. In the Kinesis Client Library when you are consuming the stream and see a user removed, look at who he was friends with and go check/remove if needed - if needed the removal should probably done through the same means. It's not truly a transaction, and relies on the fact that KCL guarantees that the records from the streams will be processed.

To add to this confusion, KCL uses Dynamo to store where in the stream is at when processing and to checkpoint processed records.

Mircea
  • 10,216
  • 2
  • 30
  • 46
0

You should try and minimize the need for transactions, which is a nice concept in a small scale, but can't really scale once you become very successful and need to support millions and billions of records.

If you are thinking in a NoSQL mind set, you can consider using a slightly different data model. One simple example is to use Global Secondary Index on a single table on the "friend-with" attribute. When you add a single record with a pair of friends, both the record and the index will be updated in a single action. Both table and index will be updated in a single action, when you delete the friendship record.

If you choose to use the Updates Stream mechanism or the Global Secondary Index one, you should take into consideration the "eventual consistency" case of the distributed system. The consistency can be achieved within milli-seconds, but it can also take longer. You should analyze the business implications and the technical measures you can take to solve it. For example, you can verify the existence of both records (main table as well as the index, if you found it in the index), before you present it to the user.

Guy
  • 12,388
  • 3
  • 45
  • 67