
I am running a Docker image on an ECS cluster so that I can shell into it and run some simple tests. However, when I run this:

aws ecs execute-command  \
  --cluster MyEcsCluster \
  --task $ECS_TASK_ARN \
  --container MainContainer \
  --command "/bin/bash" \
  --interactive

I get the error:

The Session Manager plugin was installed successfully. Use the AWS CLI to start a session.


An error occurred (TargetNotConnectedException) when calling the ExecuteCommand operation: The execute command failed due to an internal error. Try again later.

I can confirm the task + container + agent are all running:

aws ecs describe-tasks \
  --cluster MyEcsCluster \
  --tasks $ECS_TASK_ARN \
  | jq '.'
      "containers": [
        {
          "containerArn": "<redacted>",
          "taskArn": "<redacted>",
          "name": "MainContainer",
          "image": "confluentinc/cp-kafkacat",
          "runtimeId": "<redacted>",
          "lastStatus": "RUNNING",
          "networkBindings": [],
          "networkInterfaces": [
            {
              "attachmentId": "<redacted>",
              "privateIpv4Address": "<redacted>"
            }
          ],
          "healthStatus": "UNKNOWN",
          "managedAgents": [
            {
              "lastStartedAt": "2021-09-20T16:26:44.540000-05:00",
              "name": "ExecuteCommandAgent",
              "lastStatus": "RUNNING"
            }
          ],
          "cpu": "0",
          "memory": "4096"
        }
      ],

I'm defining the ECS cluster and task definition with the following CDK TypeScript code:

    new Cluster(stack, `MyEcsCluster`, {
        vpc,
        clusterName: `MyEcsCluster`,
    })

    const taskDefinition = new FargateTaskDefinition(stack, `TestTaskDefinition`, {
        family: `TestTaskDefinition`,
        cpu: 512,
        memoryLimitMiB: 4096,
    })
    taskDefinition.addContainer("MainContainer", {
        image: ContainerImage.fromRegistry("confluentinc/cp-kafkacat"),
        command: ["tail", "-F", "/dev/null"],
        memoryLimitMiB: 4096,
        // Some internet searches suggested setting this flag. This didn't seem to help.
        readonlyRootFilesystem: false,
    })
clay

2 Answers


ECS Exec Checker should be able to figure out what's wrong with your setup. Can you give it a try?

The check-ecs-exec.sh script lets you check that both your CLI environment and your ECS cluster/task are ready for ECS Exec, by calling various AWS APIs on your behalf.

Robert
mreferre
  • That tool is amazing. I was missing the ssmmessages IAM permissions; after adding those, everything worked! Thank you so much! – clay Sep 21 '21 at 19:46
  • This utility is showing that everything is allowed but it is still giving me "The execute command failed because execute command was not enabled when the task was run or the execute command agent isn’t running." Any other ideas? – Uzair Nov 22 '21 at 05:57
  • @Uzair are you using EC2 instead of Fargate? If so, I think you need to update your AMI to the latest version. I don't have a link, but I read that older AMI versions don't support it. GL – Sigex Jan 28 '22 at 22:25
  • The tool shows Exec Enabled for Task: No, but everything else is green. I can't see how to enable exec for the task. It's running on Fargate, so it should already be enabled by default, no? I went through the motions of creating a new service, but there's no checkbox or similar to enable exec – ndtreviv Apr 05 '22 at 09:12
  • Ok, so you can only do it via the command line: `aws ecs update-service --cluster your-cluster-name --enable-execute-command --service your-service-name` – ndtreviv Apr 05 '22 at 09:43
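
Two of the failure modes mentioned in the comments above (exec not enabled for the task, or the agent not running) can be read straight off the `aws ecs describe-tasks` output. Here is a minimal sketch of such a check, assuming the JSON shape shown in the question plus the task-level `enableExecuteCommand` field. The helper name `execReadinessProblems` and the interfaces are hypothetical and cover only the fields used here. Note that it cannot detect missing ssmmessages permissions: in the question the agent reports RUNNING and the command still fails.

```typescript
// Assumed subset of one task from `aws ecs describe-tasks` (only the fields used below).
interface ManagedAgent { name: string; lastStatus?: string; }
interface Container { name: string; lastStatus?: string; managedAgents?: ManagedAgent[]; }
interface Task { enableExecuteCommand?: boolean; containers?: Container[]; }

// Hypothetical helper: returns human-readable problems; an empty array means
// nothing in the task description rules out ECS Exec for that container.
function execReadinessProblems(task: Task, containerName: string): string[] {
  const problems: string[] = [];

  // Set when the task is launched from a service or run-task call with exec enabled.
  if (task.enableExecuteCommand !== true) {
    problems.push("enableExecuteCommand is not true for this task");
  }

  const container = (task.containers || []).filter(c => c.name === containerName)[0];
  if (!container) {
    problems.push("container " + containerName + " not found in task");
    return problems;
  }
  if (container.lastStatus !== "RUNNING") {
    problems.push("container " + containerName + " is not RUNNING");
  }

  // The agent entry is the one shown under "managedAgents" in the question.
  const agent = (container.managedAgents || []).filter(a => a.name === "ExecuteCommandAgent")[0];
  if (!agent) {
    problems.push("ExecuteCommandAgent is missing");
  } else if (agent.lastStatus !== "RUNNING") {
    problems.push("ExecuteCommandAgent is not RUNNING");
  }
  return problems;
}

// Usage: paste in one element of the "tasks" array from describe-tasks.
const exampleTask: Task = {
  containers: [{
    name: "MainContainer",
    lastStatus: "RUNNING",
    managedAgents: [{ name: "ExecuteCommandAgent", lastStatus: "RUNNING" }],
  }],
};
console.log(execReadinessProblems(exampleTask, "MainContainer"));
```

As far as I know, enabling exec on a service only affects tasks started afterwards, so tasks that were already running may need to be replaced (for example by forcing a new deployment) before the flag shows up on them.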

Building on @clay's comment: I was also missing the ssmmessages:* permissions.

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-exec.html#ecs-exec-required-iam-permissions says a policy such as

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ssmmessages:CreateControlChannel",
                "ssmmessages:CreateDataChannel",
                "ssmmessages:OpenControlChannel",
                "ssmmessages:OpenDataChannel"
            ],
            "Resource": "*"
        }
    ]
}

should be attached to your task role (not the task execution role), although the ssmmessages:CreateDataChannel permission alone does cut it.
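
To see which of the four documented actions a given policy document leaves out, something like the following could be used. This is a rough sketch, not an IAM evaluator: it only looks at `Allow` statements and their `Action` patterns (with `*` and `?` wildcards, compared case-insensitively), and it ignores `Deny` statements, `Resource`, and conditions. The names `missingExecActions` and `actionMatches` are hypothetical.

```typescript
// Minimal shape of an IAM policy document (only the fields used below).
interface Statement {
  Effect: "Allow" | "Deny";
  Action: string | string[];
  Resource: string | string[];
}
interface PolicyDocument { Version: string; Statement: Statement[]; }

// The four actions listed in the ECS Exec documentation policy above.
const REQUIRED_ACTIONS: string[] = [
  "ssmmessages:CreateControlChannel",
  "ssmmessages:CreateDataChannel",
  "ssmmessages:OpenControlChannel",
  "ssmmessages:OpenDataChannel",
];

// True if an IAM action pattern such as "ssmmessages:*" covers the given action.
function actionMatches(pattern: string, action: string): boolean {
  // Escape regex metacharacters except the IAM wildcards * and ?.
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const regex = new RegExp("^" + escaped.replace(/\*/g, ".*").replace(/\?/g, ".") + "$", "i");
  return regex.test(action);
}

// Hypothetical helper: returns the required actions that no Allow statement covers.
function missingExecActions(policy: PolicyDocument): string[] {
  const allowedPatterns: string[] = [];
  for (const statement of policy.Statement) {
    if (statement.Effect !== "Allow") continue; // Deny statements are ignored in this sketch
    const actions = typeof statement.Action === "string" ? [statement.Action] : statement.Action;
    for (const a of actions) allowedPatterns.push(a);
  }
  return REQUIRED_ACTIONS.filter(
    required => !allowedPatterns.some(pattern => actionMatches(pattern, required))
  );
}

// Usage: a policy that grants only CreateDataChannel.
const examplePolicy: PolicyDocument = {
  Version: "2012-10-17",
  Statement: [{ Effect: "Allow", Action: "ssmmessages:CreateDataChannel", Resource: "*" }],
};
console.log(missingExecActions(examplePolicy));
```

The result lists what is absent relative to the documented policy. It does not tell you whether exec will actually fail without those actions.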

The managed policies

arn:aws:iam::aws:policy/AmazonSSMFullAccess
arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
arn:aws:iam::aws:policy/AmazonSSMManagedEC2InstanceDefaultPolicy
arn:aws:iam::aws:policy/AWSCloud9SSMInstanceProfile

all contain the necessary permissions, with AWSCloud9SSMInstanceProfile being the most minimal.

N1ngu