I'm trying to use the EC2 Container Service (ECS), and I'm using Terraform to create it. I have defined an ECS cluster, an autoscaling group, and a launch configuration. Everything seems to work except one thing: the EC2 instances are created, but they do not register in the cluster; the cluster just says no instances are available.

In the ECS agent log on a created instance, I found the log flooded with one error:

Error registering: NoCredentialProviders: no valid providers in chain

The EC2 instances are created with the proper role, ecs_role. This role has two policies; one of them is the following, as the docs require:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecs:CreateCluster",
        "ecs:DeregisterContainerInstance",
        "ecs:DiscoverPollEndpoint",
        "ecs:Poll",
        "ecs:RegisterContainerInstance",
        "ecs:StartTelemetrySession",
        "ecs:Submit*",
        "ecs:StartTask"
      ],
      "Resource": "*"
    }
  ]
}

I'm using AMI ami-6ff4bd05 and the latest Terraform.

Aldarund

7 Answers

37

It was a problem with the trust relationship in the role: the role must include the ec2 service. Unfortunately, the error message was not all that helpful.

Example of trust relationship:

{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": ["ecs.amazonaws.com", "ec2.amazonaws.com"]
      },
      "Effect": "Allow"
    }
  ]
}
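
If the role is managed in Terraform, the same trust policy can be set through assume_role_policy. A minimal sketch, assuming the role is named ecs_role as in the question:

resource "aws_iam_role" "ecs_role" {
  name = "ecs_role"

  # Trust policy: both ecs.amazonaws.com and ec2.amazonaws.com
  # must be allowed to assume this role.
  assume_role_policy = <<EOF
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": ["ecs.amazonaws.com", "ec2.amazonaws.com"]
      },
      "Effect": "Allow"
    }
  ]
}
EOF
}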
Aldarund
    OMFG, I've been at this for the past 3-4 days, trying absolutely everything and this fixed it. How did you even stumble upon this, I didn't see this anywhere else!! Lots of people mentioning public IPs, roles, etc, etc, etc, but you are the single one person that mentioned the trust relationship. Thank you! – Pedro Mata-Mouros Aug 09 '16 at 13:35
  • And it just worked! This is so stupid Amazon! (hope you are listening) The ECS Cluster was created through your First-Time run wizard, and everything was done in accordance with the developer manual for ECS, and STILL you wasted 3 hours of my life before I found this answer. – Johan Thomsen Jul 25 '17 at 19:38
  • 27/11/2018 - I'm getting this when trying that policy: This policy contains the following error: Has prohibited field Principal – Kappacake Nov 27 '18 at 16:10
  • I agree. The error reporting for this issue is terrible. – user1607158 Jun 05 '19 at 19:54
  • This is quite an old answer and I believe AWS has resolved this problem. Changing an IAM role without knowing exactly what you are doing is dangerous, so avoid this solution if possible. – Sarang May 03 '22 at 11:06
2

Make sure you select the correct ECS role in the launch configuration.

[screenshot of the IAM role setting in the launch configuration]
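
Since the question uses Terraform, the equivalent is to point the launch configuration at an instance profile for the ECS role. A sketch; the profile name, instance type, and role reference are illustrative:

# Instance profile wrapping the ECS role defined elsewhere.
resource "aws_iam_instance_profile" "ecs" {
  name = "ecs-instance-profile"
  role = aws_iam_role.ecs_role.name
}

resource "aws_launch_configuration" "ecs" {
  name_prefix   = "ecs-"
  image_id      = "ami-6ff4bd05"
  instance_type = "t2.micro"

  # Without this, the ECS agent has no credentials and fails with
  # "NoCredentialProviders: no valid providers in chain".
  iam_instance_profile = aws_iam_instance_profile.ecs.name
}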

Deep Patel
2

I got this error today and figured out the problem: I had missed setting the IAM role in the launch template (it is under the Advanced section). You need to set it to ecsInstanceRole (this is the default name AWS gives, so check whether you have changed it and use yours accordingly).

I had switched from a Launch Configuration to a Launch Template, and while setting up the Launch Template, I missed adding the role!
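
In Terraform, a launch template carries the role in a nested iam_instance_profile block, which is easy to miss when migrating. A sketch with illustrative values:

resource "aws_launch_template" "ecs" {
  name_prefix   = "ecs-"
  image_id      = "ami-6ff4bd05"
  instance_type = "t2.micro"

  # The part that is easy to forget when moving from a launch configuration.
  iam_instance_profile {
    name = "ecsInstanceRole" # instance profile name; adjust if you renamed it
  }
}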

Sarang
  • I didn't miss the IAM role; in my case the LT lost its reference to it. Thank you for your answer here – very helpful after > 24 fun-filled hours of stress!!! – Roy Hinkley Aug 30 '23 at 18:34
1

You might want to add AmazonEC2RoleforSSM (or AmazonSSMFullAccess) to your EC2's role.
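
If the role is managed in Terraform, the managed policy mentioned in the comment below can be attached like this (a sketch; the role reference is illustrative):

resource "aws_iam_role_policy_attachment" "ssm" {
  role       = aws_iam_role.ecs_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEC2RoleforSSM"
}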

Remigiusz
  • This worked for me with a Terraform Batch resource. I had to attach the `arn:aws:iam::aws:policy/service-role/AmazonEC2RoleforSSM` policy to `batch_instance_role`. – VitoshKa Jul 02 '20 at 20:39
0

Apparently, this error message also occurs when an invalid aws-profile is passed.
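
For example, in a Terraform provider block (a sketch; the profile name is illustrative and must exist in your local AWS credentials file):

provider "aws" {
  region  = "us-west-2"
  profile = "my-profile" # an invalid name here can produce the NoCredentialProviders error
}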

Rafael Marques
0

I spent 2 days trying out everything without any luck. I have a standard setup, i.e. the ECS cluster instance in a private subnet, the ELB in a public subnet, NAT and IGW properly set up, the respective security groups and IAM role properly defined, standard config in the NACL, etc. Despite everything, the EC2 instances wouldn't register with the ECS cluster. Finally I figured out that my custom VPC's DHCP options set was configured with 'domain-name-servers: xx.xx.xx.xx, xx.xx.xx.xx', the IP addresses of my org's internal DNS servers.

The solution is to use the following values for the DHCP options set (assuming your VPC is in us-west-2): Domain Name: us-west-2.compute.internal; Options: domain-name: us-west-2.compute.internal, domain-name-servers: AmazonProvidedDNS.
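
A sketch of those DHCP options in Terraform, assuming the VPC lives in us-west-2 (the aws_vpc.main reference is illustrative):

resource "aws_vpc_dhcp_options" "ecs" {
  domain_name         = "us-west-2.compute.internal"
  domain_name_servers = ["AmazonProvidedDNS"]
}

resource "aws_vpc_dhcp_options_association" "ecs" {
  vpc_id          = aws_vpc.main.id # illustrative VPC reference
  dhcp_options_id = aws_vpc_dhcp_options.ecs.id
}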

user8898538
-1

If you use a taskDefinition, check that you set the execution and task role ARNs and attach the correct policies to those roles.
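
A sketch of where those ARNs go on a Terraform task definition (all names and files are illustrative):

resource "aws_ecs_task_definition" "app" {
  family                = "app"
  container_definitions = file("containers.json")

  # Role the ECS agent uses to pull images and write logs.
  execution_role_arn = aws_iam_role.ecs_execution.arn
  # Role the running containers assume.
  task_role_arn = aws_iam_role.ecs_task.arn
}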

Kampaii
  • I don't think this issue has anything to do with task definition as it happens _after_ instance is registered. – Sarang May 03 '22 at 11:05
  • In my case the problem was that I was not giving enough rights to the role, so I was changing the role set up for the task definition. – Kampaii May 16 '22 at 09:22
  • The problem mentioned in the question is the instance not getting registered. This happens before tasks run on the instance. Therefore, the role of the taskDefinition has no bearing on the result (and hence my downvote). The role defined in the launch config / launch template is what matters. Maybe you are using the same role in both places? – Sarang May 16 '22 at 13:38