
I wonder how DAX works with time-series data. I want to insert some data every minute, add a TTL to remove it after 14 days, and read the last 3 hours of data after each insert:

  • insert 1KB each minute
  • expire after 14 days
  • after each insert read data for the last 3 hours

3 hours is 180 minutes, so most of the time I need the last 180 items. Sometimes the data stops coming for a while, so there may be fewer than 180 items.

So that is 20,160 items, about ±19MB of data, over 14 days. How much will I read through DAX while fetching the last 3 hours of data every minute? Will it be 19MB or 180KB?

const AWS = require('aws-sdk');                   // AWS SDK v2; serverId, from and to are defined elsewhere
const dynamo = new AWS.DynamoDB.DocumentClient();

const params = {
  TableName: 'prod_server_data',
  KeyConditionExpression: 's = :server_id and t between :time_from and :time_to',
  ExpressionAttributeValues: {
    ':server_id': serverId, // string
    ':time_from': from,     // timestamp
    ':time_to': to,         // timestamp
  },
  ReturnConsumedCapacity: 'TOTAL',
  ScanIndexForward: false,
  Limit: 1440, // 24h * 60 = 1440; 1 check every 1 min
};

const queryResult = await dynamo.query(params).promise(); // inside an async function
Lukas Liesis
  • One clarifying question before I answer: do you retrieve the items using GetItem/BatchGetItem or Query/Scan? – Jeff Hardy Jul 16 '18 at 16:42
  • @JeffHardy Sorry for not adding code, adding now. I use `Query` because it can fetch several items and should cost fewer RCUs, if I understand it correctly. Most of the items are ±800 bytes, so `180 * 800 bytes ≈ 140KB`, which is about 18 RCU (0.5 RCU per 4KB, eventually consistent), while with BatchGetItem it would cost 90 RCU, since each item is rounded up to 4KB separately, isn't it? – Lukas Liesis Jul 16 '18 at 21:19
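
For reference, a rough sketch of the RCU arithmetic in that comment, assuming eventually consistent reads (0.5 RCU per 4KB) and ~800-byte items; the numbers are illustrative, not measured:

    // Query: item sizes are summed first, then rounded up to 4KB blocks.
    const itemCount = 180;
    const itemSizeBytes = 800;
    const blockBytes = 4 * 1024;

    const queryBytes = itemCount * itemSizeBytes;              // 144,000 bytes ≈ 140KB
    const queryRcu = Math.ceil(queryBytes / blockBytes) * 0.5; // 36 blocks -> 18 RCU

    // BatchGetItem: every item is rounded up to 4KB on its own.
    const batchRcu = itemCount * Math.ceil(itemSizeBytes / blockBytes) * 0.5; // 180 * 0.5 = 90 RCU

    console.log({ queryRcu, batchRcu }); // { queryRcu: 18, batchRcu: 90 }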

1 Answer


DAX caches items and queries separately, and the query cache stores the entire response, keyed by the parameters. In this case, set the query TTL to 1 minute, and make sure that :time_from and :time_to only have 1 minute resolution.
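
For example, a minimal sketch of that rounding, assuming epoch-millisecond timestamps and the same DocumentClient-style query as in the question (minuteFloor is just an illustrative helper):

    const MINUTE_MS = 60 * 1000;

    // Drop sub-minute precision so that repeated queries within the same minute
    // produce identical parameters and therefore the same DAX query-cache key.
    const minuteFloor = (ts) => Math.floor(ts / MINUTE_MS) * MINUTE_MS;

    const to = minuteFloor(Date.now());
    const from = to - 180 * MINUTE_MS; // last 3 hours, aligned to the minute

    // :time_from and :time_to now only change once per minute, so every query
    // issued within that minute is served from the DAX query cache after the
    // first one populates it.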

If you only call query once per minute, then you won't see much benefit from DAX (since it will have to go to DynamoDB every time to refresh the cache).

If you call query multiple times per minute but only expect the data to update every minute (e.g. repeatedly refreshing a dashboard), there will only be one call to DynamoDB per minute to refresh the cache, and all other requests will be served from it.
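
If it helps, here is a sketch of routing the same query through DAX with the Node.js client; the amazon-dax-client package, endpoint and region are assumptions/placeholders, and `params` is the object from the question:

    const AWS = require('aws-sdk');
    const AmazonDaxClient = require('amazon-dax-client');

    // Point a DocumentClient at the DAX cluster instead of DynamoDB directly.
    const dax = new AmazonDaxClient({
      endpoints: ['my-cluster.xxxxxx.clustercfg.dax.use1.cache.amazonaws.com:8111'], // placeholder
      region: 'us-east-1',                                                           // placeholder
    });
    const daxClient = new AWS.DynamoDB.DocumentClient({ service: dax });

    // Same params as in the question; repeated calls with identical parameters
    // are answered from the DAX query cache until its TTL expires.
    const queryResult = await daxClient.query(params).promise(); // inside an async function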

Jeff Hardy
  • Thanks for the answer. I insert a new item and then query for the last ±180 items, so each minute I get 1 new item and 179 that I already got the minute before. I'd like each new result to be cached for 3 hours, because for the next 3 hours I will retrieve the same item again and again; it just sits deeper in the array each minute and after 3 hours it drops out of the results. I usually don't ask for the same thing more than once within a minute, so a 1-minute cache would not help. Haven't read this doc yet, reading now. – Lukas Liesis Jul 16 '18 at 23:08
  • I'd have to read the items one by one to cache them the way I want, wouldn't I? It would cost more RCUs to read them that way, but eventually it would save RCUs because they'd be served from DAX. – Lukas Liesis Jul 16 '18 at 23:18
  • If the keys are predictable you could use BatchGetItem, and then all but the most recent item will be fetched from the cache instead of DDB. (If you want more guidance, the AWS forum at https://forums.aws.amazon.com/forum.jspa?forumID=131 might be a better place for discussion.) – Jeff Hardy Jul 17 '18 at 18:10
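
A sketch of that BatchGetItem idea, assuming the sort key t is minute-aligned as above, and reusing the daxClient from the earlier sketch and serverId from the question (chunking is needed because BatchGetItem accepts at most 100 keys per request):

    const MINUTE_MS = 60 * 1000;
    const to = Math.floor(Date.now() / MINUTE_MS) * MINUTE_MS;

    // Build the 180 predictable per-minute keys for the last 3 hours.
    const keys = [];
    for (let t = to - 179 * MINUTE_MS; t <= to; t += MINUTE_MS) {
      keys.push({ s: serverId, t });
    }

    // Fetch the keys in chunks of 100 through DAX; all but the newest minutes
    // should be hits in the DAX item cache.
    const items = [];
    for (let i = 0; i < keys.length; i += 100) {
      const resp = await daxClient.batchGet({
        RequestItems: { prod_server_data: { Keys: keys.slice(i, i + 100) } },
      }).promise();
      items.push(...(resp.Responses.prod_server_data || []));
      // (UnprocessedKeys handling omitted for brevity.)
    }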