14

AWS CloudWatch receives a count of 1 every time I start an image download. I am downloading thousands of images (on a cluster of EC2 instances) and would like to track the total progress.

I can't find any documentation on how to plot the cumulative sum of a metric. The AWS CloudWatch Math Expressions looked promising, but they do not have an integrate function.

Currently, I can plot the sum of the started image downloads but only for periods, as seen below. Ideally, I'd like to plot the integral of this plot:

[Image: AWS CloudWatch Metrics]

John Rotenstein
Alex Walczak

4 Answers

19

You can get a cumulative sum over the current time range by applying the SUM() function to the original metric, which records a 1 for each download. SUM() returns a single number rather than a graph, so to plot it you need to turn that single value back into a time series.

  • Define m1 as your metric. This is the metric you want to apply SUM() to.
  • Define an expression e1 as m1/m1. This yields a time series in which every value equals 1, which is what lets you convert the sum back into a time series.
  • Define an expression e2 as SUM(m1) / e1. This is the cumulative sum of m1 divided by 1 at every data point of the original time series. It appears as a horizontal line on the graph, with every point equal to the cumulative sum of m1. This step is required because CloudWatch can only plot a time series on the chart, not a single value.
  • Make m1 and e1 invisible. You need them, but you don't need to see them.
  • Finally, change the chart type from Line to Number, since you only wanted the cumulative sum anyway.

SUM() can't be plotted directly because it returns a single value. Dividing it by a time series of all 1s makes every point on the graph equal to the result of SUM(). Changing the chart to Number then hides all the math and presents only the "final result".
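
For anyone doing this outside the console, the same three definitions can be written as metric math queries. This is only a sketch: the namespace "ImageDownloads", the metric name "DownloadStarted", and the helper name are made up for illustration, and the list is in the shape that boto3's get_metric_data expects for its MetricDataQueries parameter.

```python
def constant_sum_queries(namespace, metric_name, period=300):
    """Build m1, e1 and e2 as described in the steps above.

    namespace and metric_name are placeholders; substitute your own.
    """
    return [
        {
            "Id": "m1",
            "MetricStat": {
                "Metric": {"Namespace": namespace, "MetricName": metric_name},
                "Period": period,
                "Stat": "Sum",
            },
            "ReturnData": False,  # hidden: only used as input to the expressions
        },
        # Every data point of m1 divided by itself gives a series of 1s.
        {"Id": "e1", "Expression": "m1/m1", "ReturnData": False},
        # The scalar SUM(m1) spread back over a time series.
        {
            "Id": "e2",
            "Expression": "SUM(m1)/e1",
            "Label": "Total downloads started",
            "ReturnData": True,
        },
    ]


queries = constant_sum_queries("ImageDownloads", "DownloadStarted")
```

The resulting list could then be passed as MetricDataQueries, together with StartTime and EndTime, to a boto3 CloudWatch client's get_metric_data call; only e2 is returned.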

User51
11

Looks like RUNNING_SUM() has been added, which does what you need:

[Image: Graph with RUNNING_SUM]

You can find RUNNING_SUM() under "Add math"->"All functions"
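
As a rough illustration of what RUNNING_SUM() does, here is a sketch that builds the equivalent metric math queries and reproduces the accumulation locally on made-up numbers. The namespace, metric name, helper name, and sample values are all assumptions for the example, not taken from the question.

```python
from itertools import accumulate


def running_sum_queries(namespace, metric_name, period=60):
    """Build a hidden metric m1 and a visible RUNNING_SUM(m1) expression.

    namespace and metric_name are placeholders; substitute your own.
    """
    return [
        {
            "Id": "m1",
            "MetricStat": {
                "Metric": {"Namespace": namespace, "MetricName": metric_name},
                "Period": period,
                "Stat": "Sum",
            },
            "ReturnData": False,  # hidden: only the running total is plotted
        },
        {
            "Id": "e1",
            "Expression": "RUNNING_SUM(m1)",
            "Label": "Cumulative downloads started",
            "ReturnData": True,
        },
    ]


queries = running_sum_queries("ImageDownloads", "DownloadStarted")

# Local illustration with invented per-period sums: each output point is
# the total of all points up to and including that period.
per_period = [3, 0, 5, 2]
cumulative = list(accumulate(per_period))  # [3, 3, 8, 10]
```

Note that the running total restarts at the beginning of whatever time range is being graphed, so it is cumulative only within the selected range.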

Yves M.
5

You are correct. All Amazon CloudWatch metrics are for a defined period.

The maximum period for a metric is one day, so this is not suitable for a cumulative counter that you wish to continue beyond one day.

You would need to find an alternate method of storing the count, such as an Amazon DynamoDB table. Use an atomic counter via UpdateItem to increment the count.
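
A minimal sketch of such an atomic increment, assuming a hypothetical table with a string partition key named counter_id and a numeric attribute named download_count (neither name comes from the question). The function only builds the UpdateItem request, so it can be inspected without touching AWS.

```python
def increment_counter_request(table_name, counter_id, amount=1):
    """Build UpdateItem arguments that atomically add `amount` to a counter.

    The key name "counter_id" and attribute "download_count" are
    illustrative assumptions; adapt them to your own table schema.
    """
    return {
        "TableName": table_name,
        "Key": {"counter_id": {"S": counter_id}},
        # ADD creates the attribute if it is missing, otherwise increments it.
        "UpdateExpression": "ADD #n :inc",
        "ExpressionAttributeNames": {"#n": "download_count"},
        "ExpressionAttributeValues": {":inc": {"N": str(amount)}},
        "ReturnValues": "UPDATED_NEW",
    }


request = increment_counter_request("DownloadProgress", "images", amount=1)
```

With boto3 this could be sent as boto3.client("dynamodb").update_item(**request). Passing a larger amount lets a worker batch several downloads into one write.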

John Rotenstein
  • Ended up doing this. However, I need to download 100,000,000+ things, so this can get expensive with a high write capacity (all to use only a single counter). Happy to hear any suggestions to avoid this scaling issue. – Alex Walczak Apr 09 '18 at 21:23
  • Amazon DynamoDB has a burst capability, so you don't need to get the capacity exactly right. An alternative method is to send a message to SQS and process the messages separately, which means you would need very little DynamoDB write capacity (e.g., trigger a Lambda function each hour or minute, read the messages, and increment the count). – John Rotenstein Apr 09 '18 at 21:46
  • Thanks for your suggestion. I didn't use SQS to do this at first because it doesn't support high resolution monitoring, but I'll take another look – Alex Walczak Apr 10 '18 at 07:10
1

You can also use a very long period.

Change your stat to SUM, and set your metric's period to 7 days. You'll get a time series of 1 point with the cumulative sum of all the downloads.

If you give each download a unique dimension value, you can keep your queries separate.
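
For reference, a sketch of the same idea as an API request rather than console settings: one Sum data point covering seven days. The namespace, metric name, helper name, and end date are placeholders, and the dictionary follows the parameter names of boto3's get_metric_statistics.

```python
from datetime import datetime, timedelta, timezone

SEVEN_DAYS_SECONDS = 7 * 24 * 60 * 60  # 604800


def weekly_sum_request(namespace, metric_name, end_time):
    """Build arguments asking for a single 7-day Sum ending at end_time.

    namespace and metric_name are placeholders; substitute your own.
    """
    return {
        "Namespace": namespace,
        "MetricName": metric_name,
        "StartTime": end_time - timedelta(seconds=SEVEN_DAYS_SECONDS),
        "EndTime": end_time,
        "Period": SEVEN_DAYS_SECONDS,  # one period spans the whole window
        "Statistics": ["Sum"],
    }


request = weekly_sum_request(
    "ImageDownloads",
    "DownloadStarted",
    datetime(2018, 4, 10, tzinfo=timezone.utc),
)
```

With boto3 this could be sent as boto3.client("cloudwatch").get_metric_statistics(**request), adding a Dimensions entry if each download run is tagged with its own dimension value.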

Heath Borders