
I am using DynamoDB tables to save transactional data for my API requests. I maintain two tables:

1. schedule - with SId as the hash key
2. summary - with a DynamoDBAutoGeneratedKey (UUID) as the hash key and SId as a plain attribute

The schedule table gets a single item per request, whereas the summary table gets 10 items (each with its own UUID) per SId.

We are running a load test against these two tables, and we observe that the schedule table performs well but the summary table spends a lot of time on the PutItem requests for the 10 items per call.

Can anyone suggest performance tuning for my summary DynamoDB table? Can keeping a UUID as the hash key slow down a PutItemRequest?

Any help/pointers are much appreciated.

Also, we have enabled streams on these tables, which a Lambda function consumes for cross-region replication.
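
For reference, a minimal sketch of the summary item's model class, assuming the AWS SDK for Java's DynamoDBMapper (which is where @DynamoDBAutoGeneratedKey comes from); the class name and the Id attribute name are illustrative:

    import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBAttribute;
    import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBAutoGeneratedKey;
    import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBHashKey;
    import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBTable;

    @DynamoDBTable(tableName = "summary")
    public class SummaryItem {

        private String id;  // auto-generated UUID hash key
        private String sId; // plain attribute pointing back at the schedule table

        @DynamoDBHashKey(attributeName = "Id")
        @DynamoDBAutoGeneratedKey
        public String getId() { return id; }
        public void setId(String id) { this.id = id; }

        @DynamoDBAttribute(attributeName = "SId")
        public String getSId() { return sId; }
        public void setSId(String sId) { this.sId = sId; }
    }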

MG_7
  • Using a UUID as the partition key doesn't slow down the put request. Actually, it is a best practice to use a UUID as the partition key. However, are you inserting the 10 items for an SId one at a time in the same request path? Writing them sequentially may slow the call down. Have you tried increasing the write capacity units? – notionquest Jul 21 '17 at 17:09
  • What do you mean by "transactional data"? – Ivan Mushketyk Jul 24 '17 at 12:10

2 Answers


A few things to consider:

1) Is your provisioned throughput sufficiently high for the given load test? Note that if you have multiple partitions, the throughput is divided between them; although, since you are writing a random UUID for each item, you should not have a hot-partition problem on writes.

2) Is it definitely the database that is slow, or is it the application? Could it be that you are performing the writes sequentially rather than in parallel, or using synchronous calls instead of asynchronous ones? See the batching sketch after this list.

3) Have you looked at the DynamoDB metrics in your console? You should be able to see metrics such as average put latency (SuccessfulRequestLatency) and ThrottledRequests there. These can shed some light on where the time is going.
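
If the 10 items are currently written with 10 sequential PutItem calls, batching them into a single round trip is usually the first thing to try. A minimal sketch of point 2, assuming the AWS SDK for Java v1 and a DynamoDBMapper model class like the SummaryItem in the question (the mapper generates the UUID keys client-side, and batchSave groups the puts into BatchWriteItem requests of up to 25 items):

    import java.util.ArrayList;
    import java.util.List;

    import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
    import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMapper;

    public class SummaryWriter {

        private final DynamoDBMapper mapper =
                new DynamoDBMapper(AmazonDynamoDBClientBuilder.defaultClient());

        // One BatchWriteItem round trip instead of 10 sequential PutItem calls.
        public void saveSummaries(String sId, int itemCount) {
            List<SummaryItem> items = new ArrayList<>();
            for (int i = 0; i < itemCount; i++) {
                SummaryItem item = new SummaryItem();
                item.setSId(sId);
                items.add(item); // UUID key is filled in on save by @DynamoDBAutoGeneratedKey
            }
            // In production, inspect the returned FailedBatch list and
            // retry any unprocessed items.
            mapper.batchSave(items);
        }
    }

Note that BatchWriteItem does not reduce the write capacity consumed (each item still costs its own WCUs); it only cuts the number of network round trips, which is typically where sequential puts lose time.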

Tofig Hasanov
  • Earlier I was using 10 WCU, on which the load test was performed. Changing the WCU to 50 decreased the response time; the load was generated by 3 users for 1 hour. This time I analyzed the metrics of the table: put latency and scan latency shot up, but no throttled write requests were observed. – MG_7 Jul 24 '17 at 11:13

A few things that come to mind:

  • Are you using scans by any chance? This would explain the performance degradation: scans do not exploit any knowledge about how the data is organised in DynamoDB and are simply a brute-force search. You should avoid scans, since they are inherently slow and expensive (see the sketch below).
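
For this table layout, avoiding scans means adding a global secondary index on SId and querying it instead. A sketch of what that read path could look like, assuming DynamoDBMapper, a hypothetical GSI named SId-index, and @DynamoDBIndexHashKey(globalSecondaryIndexName = "SId-index") on the model's SId getter:

    import java.util.List;

    import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
    import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMapper;
    import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBQueryExpression;

    public class SummaryReader {

        private final DynamoDBMapper mapper =
                new DynamoDBMapper(AmazonDynamoDBClientBuilder.defaultClient());

        // Query the GSI for one SId instead of scanning the whole table.
        public List<SummaryItem> findBySId(String sId) {
            SummaryItem probe = new SummaryItem();
            probe.setSId(sId);

            DynamoDBQueryExpression<SummaryItem> query =
                    new DynamoDBQueryExpression<SummaryItem>()
                            .withIndexName("SId-index")   // hypothetical index name
                            .withHashKeyValues(probe)
                            .withConsistentRead(false);   // GSIs support only eventual consistency

            return mapper.query(SummaryItem.class, query);
        }
    }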

  • Do you have a "hot partition"? You wrote:

  1. schedule - with SId as the hash key
  2. summary - with a DynamoDBAutoGeneratedKey (UUID) as the hash key and SId as a plain attribute

Is access to these values uniformly distributed? Do you have items that are accessed more often than others? If so, this may be an issue: if the majority of your reads/writes goes to a small subset of ids, you are flooding a single partition (a physical machine) with requests. I would suggest investigating this as well.

One solution is to use a cache and store frequently accessed items there. You can use either ElastiCache or DAX, a new caching solution for DynamoDB. (Note that caching helps with hot reads; it does not reduce write load.)

You can find out more about hot partitions here and here.
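
If a hot key does turn up (more likely on the schedule table, where SId is the hash key), one common mitigation is write sharding: appending a bounded random suffix to the hash key so writes for one logical key spread across several partitions. A rough sketch of the idea, with hypothetical names:

    import java.util.concurrent.ThreadLocalRandom;

    public class ShardedKeys {

        private static final int SHARD_COUNT = 10;

        // Turn one logical key into one of SHARD_COUNT physical hash keys,
        // e.g. "SID-123" -> "SID-123#7", so writes fan out across partitions.
        public static String shardedKey(String logicalKey) {
            int shard = ThreadLocalRandom.current().nextInt(SHARD_COUNT);
            return logicalKey + "#" + shard;
        }
    }

The trade-off is on the read side: fetching everything for a logical key now means issuing one query per suffix and merging the results.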

  • Are you using transactions? You wrote:

I am using DynamoDB tables to save transactional data

If by this you mean that you are using DynamoDB transactions (a client-side Java transactions library), you need to read how DynamoDB implements transactions.

Long story short, DynamoDB stores copies of all items that you update/delete/add when you perform a transaction. Additionally, DynamoDB transactions are expensive: they require 7N+4 writes per transaction, where N is the number of items involved. For example, a transaction that touches N = 10 items costs 7·10 + 4 = 74 writes.

Ivan Mushketyk