
The documentation is here...

http://boto3.readthedocs.io/en/latest/reference/services/cloudwatch.html#CloudWatch.Client.get_metric_statistics

Here is our call

import datetime
import boto3

cloudwatch = boto3.client('cloudwatch')  # client and 'now' setup assumed; not shown in the original snippet
now = datetime.datetime.utcnow()

response = cloudwatch.get_metric_statistics(
    Namespace='AWS/EC2',
    MetricName='CPUUtilization',  # reported every 5 minutes
    Dimensions=[
        {
            'Name': 'AutoScalingGroupName',
            'Value': 'Celery-AutoScalingGroup'
        },
    ],
    StartTime=now - datetime.timedelta(minutes=12),
    EndTime=now,
    Period=60,  # I can't figure out what exactly changing this does
    Statistics=['Average', 'SampleCount', 'Sum', 'Minimum', 'Maximum'],
)

Here is our response

>>> response['Datapoints']
[ {u'SampleCount': 5.0, u'Timestamp': datetime.datetime(2017, 8, 25, 12, 46, tzinfo=tzutc()), u'Average': 0.05,  u'Maximum': 0.17, u'Minimum': 0.0, u'Sum': 0.25, u'Unit': 'Percent'},
  {u'SampleCount': 5.0, u'Timestamp': datetime.datetime(2017, 8, 25, 12, 51, tzinfo=tzutc()), u'Average': 0.034, u'Maximum': 0.08, u'Minimum': 0.0, u'Sum': 0.17, u'Unit': 'Percent'}
]

Here is my question

Look at the first dictionary in the returned list. A SampleCount of 5 makes sense, I guess, because our Period is 60 (seconds) and CloudWatch supplies the 'CPUUtilization' metric every 5 minutes.

But if I change Period to, say, 3 minutes (180), I still get a SampleCount of 5 (I'd expect 1 or 2).

This is a problem because I want the Average, but I think it is averaging 5 datapoints, only 2 of which are valid (the beginning and end, which correspond to the Minimum and Maximum, that is, the CloudWatch metric at some time t and the next report of that metric at time t+5 minutes).

It is averaging these with 3 intermediate 0-value datapoints, so the Average is (Minimum + Maximum + 0 + 0 + 0) / 5.

I can just take the Minimum and Maximum, add them, and divide by 2 for a better reading, but I was hoping somebody could explain exactly what that 'Period' parameter is doing. Like I said, changing it to 360 didn't change the SampleCount, but when I changed it to 600, suddenly my SampleCount was 10.0 for one datapoint (which does make sense).
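
For concreteness, here is a minimal sketch of that (Minimum + Maximum) / 2 workaround, assuming the response shape shown above (illustrative only, not part of the original call):

dp = response['Datapoints'][0]
better_reading = (dp['Minimum'] + dp['Maximum']) / 2  # (0.0 + 0.17) / 2 = 0.085 for the first datapoint above
print(better_reading)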


1 Answer


Data can be published to CloudWatch in two different ways:

  1. You can publish your observations one by one and let CloudWatch do the aggregation.
  2. You can aggregate the data yourself and publish the statistic set (SampleCount, Sum, Minimum, Maximum).

If data is published using method 1, you would get the behaviour you were expecting. But if the data is published using method 2, you are limited by the granularity of the published data.
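
For illustration, here is roughly what the two publishing paths look like with boto3's put_metric_data. The namespace and metric name below are made up; only the shape of the call matters:

import datetime
import boto3

cloudwatch = boto3.client('cloudwatch')
now = datetime.datetime.utcnow()

# Method 1: publish raw observations one by one and let CloudWatch aggregate.
cloudwatch.put_metric_data(
    Namespace='MyApp',               # hypothetical namespace
    MetricData=[{
        'MetricName': 'QueueDepth',  # hypothetical metric
        'Timestamp': now,
        'Value': 7.0,
    }],
)

# Method 2: aggregate yourself and publish a statistic set. CloudWatch only
# ever sees the set, not the individual observations inside it, so it cannot
# re-slice the data at a finer granularity later.
cloudwatch.put_metric_data(
    Namespace='MyApp',
    MetricData=[{
        'MetricName': 'QueueDepth',
        'Timestamp': now,
        'StatisticValues': {
            'SampleCount': 5.0,
            'Sum': 35.0,
            'Minimum': 2.0,
            'Maximum': 12.0,
        },
    }],
)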

If EC2 aggregates the data over 5 minutes and then publishes a statistic set, there is no point in requesting data at the 3-minute level. However, if you request data with a period that is a multiple of the period the data was published with (e.g. 10 min), the statistics can be recalculated, which is what CloudWatch does.
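
Concretely, requesting a period that is a multiple of the 5-minute publishing interval (e.g. 600 seconds) lets CloudWatch merge the published statistic sets, which matches the SampleCount of 10.0 you saw. A sketch using the same metric as in your question:

import datetime
import boto3

cloudwatch = boto3.client('cloudwatch')
now = datetime.datetime.utcnow()

# Period=600 is a multiple of the 300-second publishing interval, so two
# published statistic sets can be merged into one datapoint (SampleCount=10).
# A period like 180 or 360 is not a clean multiple, so you remain limited by
# the 5-minute granularity of the published sets.
response = cloudwatch.get_metric_statistics(
    Namespace='AWS/EC2',
    MetricName='CPUUtilization',
    Dimensions=[{'Name': 'AutoScalingGroupName',
                 'Value': 'Celery-AutoScalingGroup'}],
    StartTime=now - datetime.timedelta(minutes=20),
    EndTime=now,
    Period=600,
    Statistics=['Average', 'SampleCount', 'Sum', 'Minimum', 'Maximum'],
)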
