I have a requirement to send AWS Backup events to a centralized Opsgenie alerting system, specifically failed backups and backups where the Windows VSS step failed. AWS directed us to use EventBridge to parse the JSON object produced by AWS Backup and determine whether the VSS portion failed.
SNS is not a viable option because we cannot 'OR' the two rules together in one filter policy, and we only have one endpoint, so a second subscription to the same topic would overwrite the first. That said, I did successfully send messages to Opsgenie via SNS. So far with EventBridge, I have not had any luck.
I have started to implement most of this in Terraform. As far as I can tell, Terraform has some limitations with EventBridge: my two rules cannot be tied to the custom bus I create, so I have to do this step manually. I also need to create the Opsgenie API integration manually, as Opsgenie does not seem to support the 'EventBridge' type yet. Only the older CloudWatch Events integration that ties into SNS seems to be available. Below is my Terraform for reference:
# This module creates an opsgenie team and will tie in existing emails to the team to use with the integration.
module "opsgenie_team" {
  source  = "app.terraform.io/etc.../opsgenie"
  version = "1.1.0"

  team_name         = "test team"
  team_description  = "test environment."
  team_admin_emails = var.opsgenie_team_admins
  team_user_emails  = var.opsgenie_team_users

  suppress_cloudwatch_events_notifications = var.opsgenie_suppress_cloudwatch_events_notifications
  suppress_cloudwatch_notifications        = var.opsgenie_suppress_cloudwatch_notifications
  suppress_generic_sns_notifications       = var.opsgenie_suppress_generic_sns_notifications
}
# Step commented out since 'Webhook' doesn't work.
#
# resource "opsgenie_api_integration" "opsgenie" {
#   name = "api-based-int-2"
#   type = "Webhook"
#
#   responders {
#     type = "user"
#     id   = data.opsgenie_user.test.id
#   }
#
#   enabled                = true
#   allow_write_access     = true
#   suppress_notifications = false
#   webhook_url            = module.opsgenie_team.cloudwatch_events_integration_sns_endpoint
# }
resource "aws_cloudwatch_event_api_destination" "opsgenie" {
  name                             = "Test"
  description                      = "Connection to OpsGenie"
  invocation_endpoint              = module.opsgenie_team.cloudwatch_events_integration_sns_endpoint
  http_method                      = "POST"
  invocation_rate_limit_per_second = 20
  connection_arn                   = aws_cloudwatch_event_connection.opsgenie.arn
}
resource "aws_cloudwatch_event_connection" "opsgenie" {
  name               = "opsgenie-event-connection"
  description        = "Connection to OpsGenie"
  authorization_type = "API_KEY"
  # Verified key seems to be valid on integration API
  # https://api.opsgenie.com/v2/integrations
  auth_parameters {
    api_key {
      key   = module.opsgenie_team.cloudwatch_events_integration_id
      value = module.opsgenie_team.cloudwatch_events_integration_api_key
    }
  }
}
# Opsgenie ID created with the manual integration step.
data "aws_cloudwatch_event_source" "opsgenie" {
  name_prefix = "aws.partner/opsgenie.com/MY-OPSGENIE-ID"
}
resource "aws_cloudwatch_event_bus" "opsgenie" {
  name              = data.aws_cloudwatch_event_source.opsgenie.name
  event_source_name = data.aws_cloudwatch_event_source.opsgenie.name
}
# Two rules I need to filter on, commented out as they cannot be tied to a custom bus with
# terraform.
# resource "aws_cloudwatch_event_rule" "opsgenie_backup_failures" {
#   name        = "capture-generic-backup-failures"
#   description = "Capture all other backup failures"
#
#   event_pattern = <<EOF
# {
#   "State": [
#     {
#       "anything-but": "COMPLETED"
#     }
#   ]
# }
# EOF
# }
#
# resource "aws_cloudwatch_event_rule" "opsgenie_vss_failures" {
#   name        = "capture-vss-failures"
#   description = "Capture VSS Backup failures"
#
#   event_pattern = <<EOF
# {
#   "detail-type": [
#     "Windows VSS Backup attempt failed because either Instance or SSM Agent has invalid state or insufficient privileges."
#   ]
# }
# EOF
# }
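For comparison, here is a rough sketch of how I would expect the first rule to look if it were declared in Terraform with an explicit bus. It relies on several assumptions that I have not confirmed against my account: that the AWS provider version in use supports the `event_bus_name` argument on `aws_cloudwatch_event_rule` and `aws_cloudwatch_event_target`, that AWS Backup job events arrive on the account's `default` bus with source `aws.backup`, detail-type `Backup Job State Change` and the job state under `detail.state`, and that an IAM role allowed to invoke the API destination already exists. The resource names ending in `_sketch` and the variable `eventbridge_invoke_role_arn` are hypothetical.

```hcl
# Hypothetical input: ARN of an IAM role that EventBridge can assume and that
# is allowed to call events:InvokeApiDestination on the destination below.
variable "eventbridge_invoke_role_arn" {
  type = string
}

resource "aws_cloudwatch_event_rule" "backup_failures_sketch" {
  name        = "capture-generic-backup-failures"
  description = "Capture all other backup failures"

  # Assumption: AWS service events are delivered to the default bus,
  # not to the partner bus created above.
  event_bus_name = "default"

  # Assumption: field names follow the AWS Backup job event shape.
  event_pattern = jsonencode({
    source        = ["aws.backup"]
    "detail-type" = ["Backup Job State Change"]
    detail = {
      state = [{ "anything-but" = "COMPLETED" }]
    }
  })
}

resource "aws_cloudwatch_event_target" "opsgenie_sketch" {
  rule           = aws_cloudwatch_event_rule.backup_failures_sketch.name
  event_bus_name = aws_cloudwatch_event_rule.backup_failures_sketch.event_bus_name
  arn            = aws_cloudwatch_event_api_destination.opsgenie.arn
  role_arn       = var.eventbridge_invoke_role_arn
}
```

Note that `anything-but` on `COMPLETED` would also match intermediate states such as a job that is still running, so the pattern may need to be narrowed once events are flowing.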
The event bus and API destination seem to be created correctly, and I can find the API key used to communicate with Opsgenie and use it in Postman to hit an Opsgenie endpoint. I manually create the rules and tie them to the custom bus. I even left the rule patterns wide open, hoping to capture any AWS Backup events, but nothing has arrived yet.
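One thing I am unsure about is the header the connection sends. Opsgenie's REST API expects the key in an `Authorization: GenieKey <key>` header, whereas my connection above uses the integration ID as the header name. A minimal sketch of a connection that sends the Authorization header instead, assuming the integration endpoint accepts the same scheme as the REST API (the variable `opsgenie_api_key` and the resource name are hypothetical):

```hcl
# Hypothetical input: the Opsgenie integration API key.
variable "opsgenie_api_key" {
  type      = string
  sensitive = true
}

resource "aws_cloudwatch_event_connection" "opsgenie_sketch" {
  name               = "opsgenie-event-connection-sketch"
  description        = "Connection to OpsGenie"
  authorization_type = "API_KEY"

  auth_parameters {
    api_key {
      # Header name and value sent with every invocation.
      # Assumption: the endpoint accepts the GenieKey scheme.
      key   = "Authorization"
      value = "GenieKey ${var.opsgenie_api_key}"
    }
  }
}
```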
I feel like I'm close, but missing a critical detail (or two). Any help is greatly appreciated.