I just finished building a scheduled Lambda function that calls the SSM send-command API to run a shell script sitting on an EC2 instance. This script pings various services running on the instance. If any are unhealthy, the script returns a non-zero exit code and I get an email notification. Regardless of the exit code, the script's stderr and stdout go to an S3 bucket of my choosing.
The IaC is all in CDK v2. I've removed some of the extraneous parts, such as scheduling the function, so you can focus on the IAM components. There are two important ones:
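The health check script itself isn't shown in this answer (mine is a shell script). As a rough sketch of the exit-code contract it follows, here is a Python equivalent; the service names and ports in `SERVICES` are hypothetical, and the TCP probe is only one assumed way to "ping" a service.

```python
import socket
import sys

# Hypothetical services to probe: (name, host, port). Replace with your own.
SERVICES = [
    ("web", "127.0.0.1", 8080),
    ("db", "127.0.0.1", 5432),
]


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_checks(services, probe=is_listening):
    """Return the names of the services whose probe failed."""
    return [name for name, host, port in services if not probe(host, port)]


def main(services=SERVICES, probe=is_listening):
    """Report unhealthy services on stderr and return the process exit code."""
    unhealthy = run_checks(services, probe)
    for name in unhealthy:
        # stderr is captured by Run Command together with stdout
        print(f"UNHEALTHY: {name}", file=sys.stderr)
    # A non-zero exit code makes Run Command mark the invocation as Failed
    return 1 if unhealthy else 0
```

Run as a script, this would end with `sys.exit(main())` so that the return value becomes the exit code that Run Command sees.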
- A service role for publishing to SNS (documented here as Tasks 2-3)
- The Lambda function's role (documented in the same place as Tasks 4-5). For Task 4, the documentation says to use "the AmazonSSMFullAccess managed policy, or a policy that provides comparable permissions." The policies in my code that provide "comparable" (i.e. scoped-down) permissions are ssmSendMessagePolicy and passRolePolicy, which get added to the Lambda function's default execution role as inline policies.
There's one more important IAM component with respect to writing the script's output to S3, but I'll cover that later.
import { readFileSync } from 'fs';
import * as path from 'path';
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as sns from "aws-cdk-lib/aws-sns";
import * as subscriptions from 'aws-cdk-lib/aws-sns-subscriptions';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as iam from 'aws-cdk-lib/aws-iam';
interface Ec2InstanceConfig {
  instanceId: string;
}
export class HealthCheckStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props: cdk.StackProps) {
    super(scope, id, props);

    // This JSON config has stuff like EC2 instance ids, bucket names, etc
    const config = JSON.parse(readFileSync(path.join(__dirname, '..', 'config.json')).toString())

    // Bucket for storing ssm send-command output
    const ssmSendCommandLogBucket = s3.Bucket.fromBucketName(
      this, 'ssmSendCommandLogBucket',
      config.ssmSendCommandLogBucket
    )

    // SNS Topic and subscriptions
    const infaAlarmTopic = new sns.Topic(this, 'infaAlarmTopic', {
      displayName: 'infaAlarmTopic',
      topicName: 'infaAlarmTopic',
    })
    config.alarmSubscribers.forEach((email: string) => {
      infaAlarmTopic.addSubscription(new subscriptions.EmailSubscription(email));
    })

    // Service Role for SSM to publish to SNS
    const servicePolicy = new iam.PolicyStatement({
      actions: [
        'sns:Publish',
      ],
      resources: [infaAlarmTopic.topicArn],
      effect: iam.Effect.ALLOW,
    })
    const serviceRole = new iam.Role(this, 'serviceRole', {
      assumedBy: new iam.ServicePrincipal("ssm.amazonaws.com"),
    })
    serviceRole.attachInlinePolicy(
      new iam.Policy(this, `servicePolicy`, {
        statements: [servicePolicy],
      })
    )

    // One Lambda Function per EC2 Instance (although you can send commands to multiple instances, I haven't worked that out yet)
    const ec2Instances: Ec2InstanceConfig[] = config.ec2Instances;
    ec2Instances.forEach(ec2Instance => {
      // Lambda Function
      const healthCheckLambda = new lambda.Function(this, `healthCheckLambda${ec2Instance.instanceId}`, {
        runtime: lambda.Runtime.PYTHON_3_9,
        handler: 'app.handler',
        code: lambda.Code.fromAsset(path.join(__dirname, '..', 'lambdas', 'health_check')),
      })

      // Environment Variables (you need to provide these to the send-command call)
      healthCheckLambda.addEnvironment('SNS_TOPIC_ARN', infaAlarmTopic.topicArn)
      healthCheckLambda.addEnvironment('SERVICE_ROLE_ARN', serviceRole.roleArn)
      // The handler also reads INSTANCE_ID, BUCKET_NAME and USER. Assumed wiring
      // for the first two; set USER to the account that owns the script.
      healthCheckLambda.addEnvironment('INSTANCE_ID', ec2Instance.instanceId)
      healthCheckLambda.addEnvironment('BUCKET_NAME', ssmSendCommandLogBucket.bucketName)

      // IAM for Lambda: ssm:SendCommand on the document and on this one instance only
      const ssmSendMessagePolicy = new iam.PolicyStatement({
        actions: [
          'ssm:SendCommand',
        ],
        resources: [
          `arn:aws:ssm:${props.env?.region}::document/AWS-RunShellScript`,
          `arn:aws:ec2:${props.env?.region}:${props.env?.account}:instance/${ec2Instance.instanceId}`,
        ],
        effect: iam.Effect.ALLOW,
      })
      // The Lambda must be allowed to pass the SNS service role to SSM
      const passRolePolicy = new iam.PolicyStatement({
        actions: [
          'iam:PassRole',
        ],
        resources: [serviceRole.roleArn],
        effect: iam.Effect.ALLOW,
      })
      healthCheckLambda.role?.attachInlinePolicy(
        new iam.Policy(this, `SMSendCommandPolicyHealthCheckLambda${ec2Instance.instanceId}`, {
          statements: [ssmSendMessagePolicy],
        })
      )
      healthCheckLambda.role?.attachInlinePolicy(
        new iam.Policy(this, `passRolePolicyHealthCheckLambda${ec2Instance.instanceId}`, {
          statements: [passRolePolicy],
        })
      )
    })
  }
}
Here's the SSM send-command call in the Lambda function (Python, using the boto3 SDK for AWS):
import os

import boto3

# All of these are supplied as Lambda environment variables
BUCKET_NAME = os.getenv("BUCKET_NAME")
INSTANCE_ID = os.getenv("INSTANCE_ID")
SNS_TOPIC_ARN = os.getenv("SNS_TOPIC_ARN")
SERVICE_ROLE_ARN = os.getenv("SERVICE_ROLE_ARN")
USER = os.getenv("USER")

ssm_client = boto3.client("ssm")


def handler(event, context):
    response = ssm_client.send_command(
        InstanceIds=[
            INSTANCE_ID,
        ],
        DocumentName='AWS-RunShellScript',
        TimeoutSeconds=60,
        Comment='Health check',
        Parameters={
            'commands': [f"""sudo -H -u {USER} bash -c '/path/to/my/health_check.sh'"""]
        },
        # stdout and stderr are written under this bucket and prefix
        OutputS3BucketName=BUCKET_NAME,
        OutputS3KeyPrefix="ssm-send-commands",
        # Notify the SNS topic only when the command fails
        NotificationConfig={
            'NotificationArn': SNS_TOPIC_ARN,
            'NotificationEvents': [
                'Failed',
            ],
            'NotificationType': 'Command'
        },
        ServiceRoleArn=SERVICE_ROLE_ARN,
    )
At this point, you're probably wondering where the permissions to write the script's output to S3 are. I couldn't find any documentation on this, and SERVICE_ROLE_ARN doesn't cover it: that role is specific to SNS, as per the documentation: "The ARN of the Identity and Access Management (IAM) service role to use to publish Amazon Simple Notification Service (Amazon SNS) notifications for Run Command commands." What I found to work is granting these permissions through the EC2 instance's IAM role. The intuition is that the SSM agent runs on the instance, so it picks up the instance role when writing to S3. The action s3:PutObject, scoped to the output prefix in the bucket (ssm-send-commands in my example), satisfies the least-privilege requirement.
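As a minimal sketch of what that scoped-down statement could look like, the helper below builds the policy document as a plain dict. The function name `output_write_policy` and the bucket name `my-log-bucket` are placeholders of mine, and the ARN assumes the standard `aws` partition.

```python
import json


def output_write_policy(bucket_name, key_prefix):
    """Build an IAM policy document allowing s3:PutObject under one key prefix."""
    # Normalize the prefix so "/ssm-send-commands/" and "ssm-send-commands" match
    prefix = key_prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:PutObject",
                # Only objects under the prefix, not the whole bucket
                "Resource": f"arn:aws:s3:::{bucket_name}/{prefix}/*",
            }
        ],
    }


# Hypothetical bucket name; the prefix matches OutputS3KeyPrefix in the handler
policy = output_write_policy("my-log-bucket", "ssm-send-commands")
print(json.dumps(policy, indent=2))
```

One way to attach it would be as an inline policy on the instance role, for example with boto3's `iam.put_role_policy(RoleName=..., PolicyName=..., PolicyDocument=json.dumps(policy))`, or with an equivalent `iam.PolicyStatement` if the instance role is managed in the same CDK app.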