
I am running some tests against DynamoDB locally and have seen behaviour I can't explain, which leads to a specific question. For context, I ran my tests with the Node.js SDK V3 using DynamoDBDocumentClient (a utility that converts JavaScript objects to DynamoDB attributes: https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/dynamodb-example-document-client.html).

I noticed that, locally, calling PutItem with a list of maps was a lot faster (300ms) than calling it with a map of lists (4s). List of maps (case 1):

{
    "L": [
      {
        "M": {
          "Element1": {
            "L": [
              {
                "S": "1"
              },
              {
                "N": "1660964494"
              },
              {
                "S": "1"
              }
            ]
          }
        }
      }, //... more elements continuing here

Map of lists (case 2):

{
    "M": {
      "Element1": {
        "L": [
          {
            "S": "1"
          },
          {
            "N": "1660964550"
          },
          {
            "S": "1"
          }
        ]
      },// More and more elements after that...
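For reference, the two layouts might look roughly like this as plain JavaScript objects, before DynamoDBDocumentClient marshals them into the attribute format shown above. This is only a sketch: it assumes each map in case 1 holds a single element, and the second element and its values are made up because the real data is elided.

```javascript
// Case 1: a list of maps (marshalled to "L" containing "M" entries).
// Element2 and its values are hypothetical placeholders.
const listOfMaps = [
  { Element1: ["1", 1660964494, "1"] },
  { Element2: ["2", 1660964495, "1"] },
];

// Case 2: a map of lists (marshalled to "M" containing "L" entries).
// Merging the single-key maps gives the equivalent map-of-lists shape.
const mapOfLists = Object.assign({}, ...listOfMaps);

console.log(Object.keys(mapOfLists));
```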

Also, as expected, the request costs more or fewer capacity units depending on the structure of the data. Locally, for my data, I get: case 1 => 176 CapacityUnits, case 2 => 138 CapacityUnits.

However, since case 1 seems a lot faster than case 2 locally (maybe because of the way DynamoDB stores lists and maps), I would like to know whether it would be better to use case 2, since it uses fewer capacity units, which I guess means I will pay less. Or maybe case 1 is better because there is a cost in DynamoDB tied to the speed of the request? (I can't find any documentation on this.)

Or maybe this is just local behaviour, and the speed doesn't correlate with production DynamoDB?

Sekki
  • My suggestion is that you pick the data model that best supports your expected access patterns. Also, I wouldn't measure against DynamoDB Local because its results may not be representative of the actual DynamoDB service. – jarmod Aug 20 '22 at 14:55
  • Yes, thank you. Out of curiosity I will test it in production, but it seems a bit strange to me that a request costing fewer CapacityUnits can take more execution time than the other. – Sekki Aug 22 '22 at 00:09

1 Answer


A write capacity unit is 1 KB, suggesting that your list-of-maps and map-of-lists have somewhat different lengths: 176 KB vs 138 KB. I don't think that's surprising; even in your example, the first version looks a little longer. It's not a big difference, but the shorter version obviously has the advantage that it will cost you less to write, store, and finally read. However, I suggest that you measure this on the actual DynamoDB, not on DynamoDB Local, which isn't guaranteed to behave identically to DynamoDB.
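To make the arithmetic concrete, here is a minimal sketch of how a standard (non-transactional) write could be costed from the item size. The helper name estimateWriteCapacityUnits is hypothetical, and the byte sizes are simply inferred from the capacity units reported in the question, not measured.

```javascript
// Hypothetical helper: one write capacity unit covers up to 1 KB,
// and the item size is rounded up to the next whole KB.
function estimateWriteCapacityUnits(itemSizeBytes) {
  const KB = 1024;
  return Math.max(1, Math.ceil(itemSizeBytes / KB));
}

// Sizes assumed from the question's reported capacity units.
const case1Units = estimateWriteCapacityUnits(176 * 1024); // list of maps
const case2Units = estimateWriteCapacityUnits(138 * 1024); // map of lists

console.log(case1Units, case2Units);
```

Because of the rounding, an item of only a few hundred bytes is still billed as one full unit.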

However, you should also keep in mind that having very large items is not a good idea in DynamoDB. There is a hard limit of 400 KB which your 176 KB example comes pretty close to reaching, so if your use case grows a bit it will exceed this hard limit. Also, every read of the 176 KB item will need to read its entirety, and every modification to a small piece will need to re-write the entire 176 KB, costing you a lot.

A third alternative you can consider, instead of a list, is a partition: a partition can hold multiple items (identified by different sort keys), so instead of a list attribute you can keep the list's elements as separate items in one partition. This will allow you to read or modify an individual item without paying for reading or writing the entire list, but will still allow you to read the entire list (the partition) when you want to.
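As a rough sketch of that idea, the function below turns one large map-of-lists into one item per element. The attribute names pk, sk and values, the key "container#1", and the Element2 data are all hypothetical; the real key design depends on your access patterns.

```javascript
// Hypothetical helper: split one big map-of-lists item into many small items
// that share a partition key and differ by sort key.
function splitIntoItems(partitionKey, mapOfLists) {
  return Object.entries(mapOfLists).map(([elementName, values]) => ({
    pk: partitionKey, // same for every element, so they land in one partition
    sk: elementName,  // unique per element, so each can be read or written alone
    values,
  }));
}

const items = splitIntoItems("container#1", {
  Element1: ["1", 1660964550, "1"],
  Element2: ["2", 1660964551, "0"], // made-up second element
});

console.log(items.length);
```

Each resulting object could then be written as its own item, and the whole set could be read back with a query on the shared partition key.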

Nadav Har'El
  • Thank you. I'm not sure this is what you mean by partition, but I figured out I can redesign my system to split those big items into multiple items. However, I have one other issue: I need to frequently update or add new items of this kind, so it will result in many write operations and I don't know how to avoid that. I can read them easily using the sort key or an LSI, but I still need to write and update a lot of items at once. – Sekki Aug 22 '22 at 00:06
  • A "partition" is many items which share the same "partition key" and differ by the value of their "sort key". Instead of having a "list" of values, you can have a partition, and each value will be in a separate item. Anyway, DynamoDB prices writes according to their length, so a 100KB write will cost the same as 100 writes of 1KB each. You're right that doing tiny writes (less than 1KB each) will cost you too much. – Nadav Har'El Aug 22 '22 at 11:41