
My own image public.ecr.aws/f6q1r4v8/amazonlinuxwithshell:latest fails to start on AWS (FARGATE) in a very weird way:

Last status Stopped

Stopped reason CannotPullContainerError: inspect image has been retried 5 time(s): failed to resolve ref "public.ecr.aws/f6q1r4v8/amazonlinuxwithshell:latest": failed to do request: Head https://public.ecr.aws/v2/f6q1r4v8/amazonlinuxwithshell/manifests/latest: dial t...

Note that awslogs remains empty (although with an earlier version of my image the logs were not empty).

What's wrong, and how can I make it work?

porton
  • The message is cut off at the end. Please post the complete message. – Michael Hampton Aug 01 '21 at 16:06
  • @MichaelHampton No way, this is all that AWS shows me. – porton Aug 01 '21 at 16:50
  • It does not work when there is no public instance IP... Why? That's a weird Amazon bug. – porton Aug 01 '21 at 20:18
  • @porton The infrastructure needs to go out to the public ECR endpoint to pull the image. This can only happen if your task is private and has a way to route out to the Internet (e.g. a NAT GW), or if your task has a public IP address that can route to the ECR endpoint. – mreferre Aug 02 '21 at 07:26
  • @mreferre Amazon should temporarily assign a public IP to such a FARGATE task. Not doing so is a bug. – porton Aug 04 '21 at 21:29
  • I am not sure it's a "bug". It's a networking construct. You either enable it or you don't. Also there are customers that configure that networking construct prescriptively to avoid going out to the Internet, how could AWS possibly override it and temporarily (how long?) allow outbound communications? – mreferre Aug 05 '21 at 08:01
  • @mreferre Certainly AWS should not enable network communications for the container without an explicit user request. But why not enable them for the FARGATE engine itself _while the container is not running_? – porton Aug 06 '21 at 01:39
  • Because this would/could be seen as a policy/governance/security posture limitation for customers that do NOT want ANY internet connectivity in their own VPC. Think about customers that use AWS as an extension of their data center (via Direct Connect) and have very strict rules about what can go out and what cannot. – mreferre Aug 06 '21 at 09:04
  • In a scenario like that a user would (potentially) be able to run a task that points to . – mreferre Aug 06 '21 at 09:06
  • How did you trigger the ECS task? I'm curious why you were able to get 5 retry attempts. In my case, this error occurs randomly but I only get 1 retry attempt. If I could get 5 attempts, I probably wouldn't see any failures. (btw, I'm using the `boto3 run_task()` method) – Zach Dec 20 '21 at 15:32
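
The networking requirement described in the comments (a public IP on the task, or a private task with a route out such as a NAT gateway) can be sketched for the `boto3` `run_task()` call mentioned above. This is a minimal sketch, not a confirmed fix for this exact task: the helper names `fargate_run_task_kwargs` and `launch`, and the cluster, task definition, subnet, and security group values, are hypothetical placeholders. It assumes the subnet is a public one, meaning its route table points at an internet gateway; without that, a public IP alone does not give the task a path to public.ecr.aws.

```python
def fargate_run_task_kwargs(cluster, task_definition, subnets,
                            security_groups, public_ip=True):
    """Build keyword arguments for the ECS run_task call on Fargate.

    public_ip=True asks ECS to attach a public IP to the task's network
    interface so the image pull can reach the public ECR endpoint.
    Use public_ip=False only if the subnets route out through a NAT gateway.
    """
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "count": 1,
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": list(subnets),
                "securityGroups": list(security_groups),
                "assignPublicIp": "ENABLED" if public_ip else "DISABLED",
            }
        },
    }


def launch(kwargs):
    """Start the task. Needs boto3 installed and AWS credentials configured."""
    import boto3  # imported here so the helper above works without boto3

    return boto3.client("ecs").run_task(**kwargs)


if __name__ == "__main__":
    # All identifiers below are placeholders, not values from the question.
    kwargs = fargate_run_task_kwargs(
        cluster="my-cluster",
        task_definition="my-task",
        subnets=["subnet-EXAMPLE"],
        security_groups=["sg-EXAMPLE"],
    )
    print(kwargs["networkConfiguration"])
    # launch(kwargs)  # uncomment to actually start the task
```

Keeping the network settings in a plain dictionary makes it easy to check what will be sent before the task is started. The security group also has to allow outbound HTTPS for the pull to succeed.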

0 Answers