
Setup 1 (not working)

I have a task running in an ECS cluster, but it goes down immediately after starting because of a failed health check.

My service is Spring Boot based and has both a traffic port (for service calls) and a management port (for health checks). I have granted permitAll() access to the "*/health" path.

I also configured the management port by selecting the override port option in the target group's health check tab (screenshot attached).

Setup 2 (working fine)

I have the same setup in my docker-compose file, and I can access the health check endpoint in my local container. This is how I defined it in my compose file:

service:
  image: repo/a:name
  container_name: container-1
  ports:
    - "9904:9904" # traffic port
    - "8084:8084" # management port

So I tried configuring the management port in the container section of the task definition. I then tried updating the corresponding service to use this latest revision of the task definition, but when I save the service I get an error. Is this the right way to handle this?

Error in ECS console:

Failed updating Service : The task definition is configured to use a dynamic host port, but the target group with targetGroupArn arn:aws:elasticloadbalancing:us-east-2:{accountId}:targetgroup/ecs-container-tg/{someId} has a health check port specified.

Three possible resolutions:

  1. Is there a way I can specify this port mapping in the Dockerfile?
  2. Is there another way to configure the management port mappings in the container config of the task definition within ECS? (Preferred)
  3. Get rid of Spring Boot's actuator endpoint and implement our own endpoint for health? (Bad: I would need to implement a lot to show all the details that Spring Boot returns.)
Adiii
Sravan

2 Answers


The task definition is configured to use a dynamic host port but target has a health check port specified.

Based on the error, it seems you have configured dynamic port mapping in the task definition. You can verify this in the task definition's port mappings.


understanding-dynamic-port-mapping-in-amazon-ecs

With dynamic port mapping, the ECS scheduler assigns and publishes a random port on the host, which will differ from 8082. Change the target group's health check port setting to "traffic port" accordingly.


This should resolve the health check issue. Now, to your questions:

Is there a way I can specify this port mapping in the Dockerfile?

No. Port mapping happens at run time, not at build time. You specify it in the task definition.

Is there another way to configure the management port mappings in the container config of the task definition within ECS? (Preferred)

You can assign a static port mapping, which means the published (host) port and the exposed (container) port are the same, for example 8082:8082. With a static port mapping, the health check on that port will work.
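As a rough illustration of the difference, the sketch below builds the port mappings of a container definition as a plain Python dictionary, in the shape the ECS task definition JSON uses. The container name and image are copied from the question's compose file, the memory value is a placeholder, and 8082 is the management port used in this answer (the question's compose file uses 8084). In bridge mode, a `hostPort` of 0 requests a dynamic host port, while a `hostPort` equal to the `containerPort` gives a static mapping.

```python
import json

TRAFFIC_PORT = 9904     # traffic port from the question's compose file
MANAGEMENT_PORT = 8082  # management port as used in this answer (assumption)


def port_mapping(container_port, static):
    """Return one ECS-style port mapping entry.

    static=True  -> hostPort equals containerPort (fixed host port)
    static=False -> hostPort 0, which asks ECS for a dynamic host port
    """
    return {
        "containerPort": container_port,
        "hostPort": container_port if static else 0,
        "protocol": "tcp",
    }


# Hypothetical container definition; the memory value is a placeholder.
container_definition = {
    "name": "container-1",
    "image": "repo/a:name",
    "memory": 512,
    "portMappings": [
        port_mapping(TRAFFIC_PORT, static=True),
        port_mapping(MANAGEMENT_PORT, static=True),
    ],
}

print(json.dumps(container_definition, indent=2))
```

With both mappings static, a health check port override of 8082 on the target group should point at a port that is actually published on the host.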

Get rid of Spring Boot's actuator endpoint and implement our own endpoint for health? (Bad: I would need to implement a lot to show all the details that Spring Boot returns.)

The health check is a simple HTTP GET call for which the ALB expects a 200 HTTP status code in response, so you can create a simple endpoint that returns a 200 status code.
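To make the idea concrete, here is a minimal, language-neutral sketch of such an endpoint using only the Python standard library. The question's service is Spring Boot, where the equivalent would be a small controller method. The `/health` path, the `{"status": "UP"}` body, and the names `HealthHandler` and `make_health_server` are all assumptions chosen for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answer GET /health with 200 and a small JSON body; anything else gets 404."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "UP"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Keep the example quiet; remove this override to see request logs.
        pass


def make_health_server(port):
    """Create (but do not start) a server bound to all interfaces on `port`."""
    return HTTPServer(("0.0.0.0", port), HealthHandler)
```

Calling `make_health_server(8082).serve_forever()` would start it on the management port used in this answer.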

Adiii
  • Thanks for your response. For the time being, I implemented a simple endpoint in my service. The response body of this API is: { "status": true, "details": "Server is up & running" }. I am still facing an issue: "(reason Health checks failed with these codes: [502])" – Sravan Jul 10 '20 at 06:50
  • Okay, now check the ECS events tab: is there any event? Is the application really running? Are you able to SSH to the instance? – Adiii Jul 10 '20 at 07:13
  • Yes, there are a number of events, one every 3 minutes: it starts a new task after killing the old one. That's where I found the 502 error message. EC2 is running, and I can SSH as well. I think the issue is that the ALB can't read the response I'm sending, because I got the same issue with the old health check. This means the ALB can reach the health endpoint in both of my cases (the simple endpoint and the framework's endpoint). – Sravan Jul 10 '20 at 13:51
  • So it seems the task is failing. SSH in and run `curl localhost:PORT/health`. If this responds inside the instance, then we can investigate at the ALB level. – Adiii Jul 10 '20 at 13:53
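The check suggested in the last comment can also be scripted. The sketch below is a rough stand-in for the `curl` call, assuming the endpoint is plain HTTP. The helper names `probe` and `describe` are hypothetical, and the hints they print only restate the reasoning in the comments above.

```python
import http.client
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Return the HTTP status code for a GET on `url`, or None if no valid HTTP response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with a 4xx/5xx status.
        return error.code
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # Connection refused, timeout, malformed response, and similar.
        return None


def describe(status):
    """Turn a probe result into a short hint about where to look next."""
    if status is None:
        return "no response on the instance: check the application and port mapping"
    if status == 200:
        return "responding on the instance: investigate at the ALB level"
    return f"responding with HTTP {status}: check the endpoint itself"
```

On the instance, `print(describe(probe("http://localhost:PORT/health")))` would be used with `PORT` replaced by the host port the task actually publishes.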

So, after two days of trying different things, this is what worked:

  1. In the task definition, the network mode should be "bridge".
  2. In the task definition, leave the task-level CPU and memory units empty. Providing them at the container level should be enough.
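The two points above can be written as a small check against a task definition loaded as a Python dictionary. This is only a sketch of what worked in this case, not a general ECS rule. The function name `check_task_definition` and the example values are hypothetical, while the keys (`networkMode`, `cpu`, `memory`, `containerDefinitions`, `memoryReservation`) follow the task definition JSON.

```python
def check_task_definition(task_def):
    """Return a list of deviations from the setup described above (empty if none)."""
    problems = []
    if task_def.get("networkMode") != "bridge":
        problems.append("networkMode is not 'bridge'")
    # Task-level sizes should be left empty in this setup.
    for key in ("cpu", "memory"):
        if task_def.get(key):
            problems.append(f"task-level {key} is set")
    # Each container should carry its own memory setting instead.
    for container in task_def.get("containerDefinitions", []):
        if not (container.get("memory") or container.get("memoryReservation")):
            problems.append(f"container {container.get('name')!r} has no memory setting")
    return problems


# Hypothetical task definition matching the two points above.
example = {
    "family": "example-task",
    "networkMode": "bridge",
    "containerDefinitions": [
        {"name": "container-1", "image": "repo/a:name", "cpu": 256, "memory": 512},
    ],
}

print(check_task_definition(example))
```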
Sravan