I have defined a CloudWatch alarm for AWS ECS scale-in/scale-out.
Normally it works fine, but sometimes it fails with the error below. 500 is the scale-out threshold, the metric period is 5 minutes, and the scale-out alarm needs 1 of 2 datapoints (i.e. one value exceeding the threshold within 10 minutes):
"error": "No step adjustment found for metric value [437.08774491907025, 516.9558339660845] and breach threshold 500.0"
The step adjustment is defined as follows:

step_adjustment {
  metric_interval_lower_bound = 0
  scaling_adjustment          = 1
}
Alarm config:

datapoints_to_alarm = "1"
evaluation_periods  = "2"
threshold           = "500"
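For context, the alarm matching that config would look roughly like this in Terraform. This is only a sketch: the metric name, namespace, statistic, and alarm action are assumptions, since the actual alarm resource is not shown in this question.

```terraform
# Sketch of a CloudWatch alarm matching the config above.
# metric_name, namespace, statistic, and alarm_actions are assumptions;
# the real alarm resource is not included in the question.
resource "aws_cloudwatch_metric_alarm" "scale_up" {
  alarm_name          = "appScalingAlarm_${aws_ecs_service.sqs_to_kinesis.name}_ScaleUp"
  comparison_operator = "GreaterThanThreshold"
  metric_name         = "ApproximateNumberOfMessagesVisible" # assumed metric
  namespace           = "AWS/SQS"                            # assumed namespace
  statistic           = "Maximum"                            # assumed statistic
  period              = "300" # metric time is every 5 min
  evaluation_periods  = "2"
  datapoints_to_alarm = "1" # 1 of 2: one breaching value in 10 min alarms
  threshold           = "500"
  alarm_actions       = ["${aws_appautoscaling_policy.task_count_up.arn}"]
}
```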
Terraform code for the scaling policies:
resource "aws_appautoscaling_policy" "task_count_up" {
  name               = "appScalingPolicy_${aws_ecs_service.sqs_to_kinesis.name}_ScaleUp"
  service_namespace  = "ecs"
  resource_id        = "service/${aws_ecs_cluster.shared-elb-access-logs-processor.name}/${aws_ecs_service.sqs_to_kinesis.name}"
  scalable_dimension = "ecs:service:DesiredCount"

  step_scaling_policy_configuration {
    adjustment_type         = "ChangeInCapacity"
    cooldown                = "${var.scale_up_cooldown_seconds}"
    metric_aggregation_type = "Maximum"

    step_adjustment {
      metric_interval_lower_bound = 0
      scaling_adjustment          = 1
    }
  }

  depends_on = [
    "aws_appautoscaling_target.main",
  ]
}
resource "aws_appautoscaling_policy" "task_count_down" {
  name               = "appScalingPolicy_${aws_ecs_service.sqs_to_kinesis.name}_ScaleDown"
  service_namespace  = "ecs"
  resource_id        = "service/${aws_ecs_cluster.shared-elb-access-logs-processor.name}/${aws_ecs_service.sqs_to_kinesis.name}"
  scalable_dimension = "ecs:service:DesiredCount"

  step_scaling_policy_configuration {
    adjustment_type         = "ChangeInCapacity"
    cooldown                = "${var.scale_down_cooldown_seconds}"
    metric_aggregation_type = "Minimum"

    step_adjustment {
      metric_interval_upper_bound = 0
      scaling_adjustment          = -1
    }
  }

  depends_on = [
    "aws_appautoscaling_target.main",
  ]
}