I have a FastAPI app hosted on an EC2 instance, with an ELB in front of it to secure the endpoints with SSL.

The app is run using a docker-compose.yml file:

version: '3.8'

services:

  fastapi:
    build: .
    ports:
      - 8000:8000
    command: uvicorn app.main:app --host 0.0.0.0 --reload
    volumes:
      - .:/kwept
    environment:
      - CELERY_BROKER_URL=redis://redis:6379/0
      - CELERY_RESULT_BACKEND=redis://redis:6379/0
    depends_on:
      - redis

  worker:
    build: .
    command: celery worker --app=app.celery_worker.celery --loglevel=info --logfile=app/logs/celery.log
    volumes:
      - .:/kwept
    environment:
      - CELERY_BROKER_URL=redis://redis:6379/0
      - CELERY_RESULT_BACKEND=redis://redis:6379/0
    depends_on:
      - fastapi
      - redis

  redis:
    image: redis:6-alpine
    command: redis-server --appendonly yes
    volumes:
      - redis_data:/data

volumes:
  redis_data:

Until Friday evening, the ELB endpoint was working fine and I could use it. But since this morning, I have suddenly started getting a 502 Bad Gateway error. I made no changes to the code or to the settings on AWS.

The ELB listener settings on AWS:

[Screenshot: ELB listener settings]

The target group that is connected to the EC2 instance

[Screenshot: target group settings]

When I log into the EC2 instance & check the logs of the docker container that is running the fastapi app, I see the following:

[Screenshot: docker container logs for the fastapi service]

These logs show that the app is starting correctly.

I have not configured any health checks specifically; I just have the default settings:

[Screenshot: target group health check settings]

Output of netstat -ntlp

[Screenshot: netstat -ntlp output]

I have the logs on the ELB:

http 2022-07-21T06:47:12.458060Z app/dianee-tools-elb/de7eb044e99165db 162.142.125.221:44698 172.31.31.173:443 -1 -1 -1 502 - 41 277 "GET http://18.197.14.70:80/ HTTP/1.1" "-" - - arn:aws:elasticloadbalancing:eu-central-1:xxxxxxxxxx:targetgroup/dianee-tools/da8a30452001c361 "Root=1-62d8f670-711975100c6d9d4038d73544" "-" "-" 0 2022-07-21T06:47:12.457000Z "forward" "-" "-" "172.31.31.173:443" "-" "-" "-"
http 2022-07-21T06:47:12.655734Z app/dianee-tools-elb/de7eb044e99165db 162.142.125.221:43836 172.31.31.173:443 -1 -1 -1 502 - 158 277 "GET http://18.197.14.70:80/ HTTP/1.1" "Mozilla/5.0 (compatible; CensysInspect/1.1; +https://about.censys.io/)" - - arn:aws:elasticloadbalancing:eu-central-1:xxxxxxxxxx:targetgroup/dianee-tools/da8a30452001c361 "Root=1-62d8f670-5ceb74c8530832f859038ef6" "-" "-" 0 2022-07-21T06:47:12.654000Z "forward" "-" "-" "172.31.31.173:443" "-" "-" "-"
http 2022-07-21T06:47:12.949509Z app/dianee-tools-elb/de7eb044e99165db 162.142.125.221:48556 - -1 -1 -1 400 - 0 272 "- http://dianee-tools-elb-yyyyyy.eu-central-1.elb.amazonaws.com:80- -" "-" - - - "-" "-" "-" - 2022-07-21T06:47:12.852000Z "-" "-" "-" "-" "-" "-" "-"
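To read entries like these more easily, the leading fields can be split out with a few lines of Python. This is only a sketch: the field order is the one given in the AWS documentation for ALB access logs, `FIELDS` and `parse_alb_entry` are illustrative names, and the sample is the first entry above, truncated after the user-agent field.

```python
import shlex

# Leading fields of an ALB access log entry, in documented order.
FIELDS = [
    "type", "time", "elb", "client", "target",
    "request_processing_time", "target_processing_time",
    "response_processing_time", "elb_status_code", "target_status_code",
    "received_bytes", "sent_bytes", "request", "user_agent",
]


def parse_alb_entry(line: str) -> dict:
    """Map the leading fields of one log line to their names."""
    # shlex.split keeps quoted fields such as the request line together.
    return dict(zip(FIELDS, shlex.split(line)))


# First entry from the logs above, truncated after the user-agent field.
SAMPLE = (
    "http 2022-07-21T06:47:12.458060Z app/dianee-tools-elb/de7eb044e99165db "
    "162.142.125.221:44698 172.31.31.173:443 -1 -1 -1 502 - 41 277 "
    '"GET http://18.197.14.70:80/ HTTP/1.1" "-"'
)

entry = parse_alb_entry(SAMPLE)
target_port = entry["target"].rsplit(":", 1)[1]
print(entry["elb_status_code"], entry["target_status_code"], target_port)
```

For this entry the target is 172.31.31.173:443, the load balancer status is 502, and the target status is "-". The three processing times of -1 are what AWS documents for a request the load balancer could not dispatch to the target.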
some_programmer
  • Have you checked if the app is in fact running? How have you configured the health checks? Have you tried to do a curl on that path/host from inside the EC2 instance? – Riz Jul 18 '22 at 10:40
  • @Riz I updated the question to include the logs that show that the app is running – some_programmer Jul 18 '22 at 10:47
  • Also, can you share the health check settings you have configured? And can you run `netstat -ntlp` on your EC2 instance? – Riz Jul 18 '22 at 10:52
  • "The port the load balancer uses when performing health checks on targets. The default is to use the port on which each target receives traffic from the load balancer." You need to have port 8000 and not 443. Not sure how it was working before if you haven't changed anything. – Riz Jul 18 '22 at 12:42
  • @Riz I have 2 other apps running the same way with the same settings and they are running without any problems – some_programmer Jul 19 '22 at 06:38
  • They also don't have port 443 open in the EC2? Change your port from 443 to 8000 and check. You can make a new target group with different settings. Also increase the timeout and interval in healthcheck in target group and also check for the status code. Some might return a 301,302 too. Also enable access logs for your target group, this way you will know what causes the 502 errors. – Riz Jul 19 '22 at 07:49
  • I see you are using the EC2 launch type. I'd suggest you SSH into the container and try curling localhost on port 8080; it should return your application page. After that, check the same on the instance as well, since you have mapped the container to port 8080. If this also works, try changing the target group port to 8080, which is the port your application listens on. If the same setup is working on other resources, it could be that you are using redirection. If this doesn't help, fetch the full logs using https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-logs-collector.html – Gurpreet Singh Jul 21 '22 at 18:42
  • @GurpreetSingh I tried `curl -v curl -X 'GET' 'http://0.0.0.0:8080/api/v1/kwept-invest-tool/invest-last-session-data' -H 'accept: application/json'`, but it doesn't work. I get the following error: `curl: (7) Failed to connect to 0.0.0.0 port 8080 after 0 ms: Connection refused`. But when I use 8000 in place of 8080, I get the response – some_programmer Jul 22 '22 at 06:10
  • @GurpreetSingh Were you able to fix it? Most probably, some library got auto-updated if you didn't change anything, and so it's not working anymore. Could you check the system (EC2), docker, and uvicorn logs? Also, are you sure the request is reaching the EC2 instance, docker, and uvicorn and/or your app? Please confirm. Thanks! – IamAshKS Jul 23 '22 at 06:39
  • @Junkrat That implies that your application is working on port 8000. You need to modify the target group to perform the health check there. Once the target group port is changed to 8000, the health check should go through. – Gurpreet Singh Jul 27 '22 at 11:45
  • @GurpreetSingh That was the problem. If you can add it in the answer, I will accept it – some_programmer Jul 27 '22 at 17:57

3 Answers

What is "502 Bad Gateway"?

The HyperText Transfer Protocol (HTTP) 502 Bad Gateway server error response code indicates that the server, while acting as a gateway or proxy, received an invalid response from the upstream server.
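As a rough illustration of that definition, here is a minimal Python sketch of how a gateway might choose the status it reports to its client. `gateway_status` is a hypothetical helper, not how an ELB is implemented, and mapping timeouts to 504 is an assumption based on the usual meaning of that code.

```python
import http.client
import socket
import urllib.error
import urllib.request

# Ignore any proxy environment variables so the upstream is contacted directly.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def gateway_status(upstream_url: str, timeout: float = 2.0) -> int:
    """Return the status code a simple gateway would send to its client."""
    try:
        with _OPENER.open(upstream_url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The upstream sent a valid HTTP error response; pass it through.
        return err.code
    except urllib.error.URLError as err:
        # Connection-level failure (refused, reset, DNS error, ...).
        if isinstance(err.reason, socket.timeout):
            return 504
        return 502
    except socket.timeout:
        # The upstream accepted the connection but did not answer in time.
        return 504
    except (http.client.HTTPException, OSError):
        # The upstream answered with something that is not valid HTTP,
        # or the connection broke while the response was being read.
        return 502
```

Calling `gateway_status("http://127.0.0.1:443/")` on a machine where nothing listens on port 443 takes the connection-refused branch, which is the kind of situation the load balancer reports as a 502.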

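Default ports

  • http - port number: 80

  • https - port number: 443

In your docker-compose.yml file you publish port 8000, but the ELB access logs show the load balancer forwarding to port 443 on the instance (172.31.31.173:443). Unless something on the instance listens on that port, the request fails.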

Possible solutions

  • Use NGINX

Install NGINX and add the server config:

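server {
    listen 80;
    # To terminate TLS on the instance itself, enable the next three lines
    # together: nginx needs a certificate for any "listen ... ssl" socket.
    # listen 443 ssl;
    # ssl_certificate /etc/nginx/ssl/server.crt;
    # ssl_certificate_key /etc/nginx/ssl/server.key;
    # server_name <DOMAIN/IP>;
    location / {
        # Forward requests to the uvicorn container published on port 8000.
        proxy_pass http://127.0.0.1:8000;
    }
}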
  • Change the published port to 80 or 443 in the docker-compose.yml file

My suggestion is to use NGINX.

anjaneyulubatta505

I see you are using the EC2 launch type. I'd suggest you SSH into the container and try curling localhost on port 8080; it should return your application page. After that, check the same on the instance as well, since you have mapped the container to port 8080. If this also works, try changing the target group port to 8080, which is the port your application listens on. If the same setup is working on other resources, it could be that you are using redirection. If this doesn't help, fetch the full logs using https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-logs-collector.html

Your curl test shows that the application is listening on port 8000, not 8080, so you need to modify the target group to forward traffic and perform the health check there. Once the target group port is changed to 8000, the health check should go through.
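The same check can be scripted when curl is not available. The sketch below only tests whether something accepts TCP connections on a port; it does not send an HTTP request. `port_is_open` and `open_ports` are illustrative names, and the candidate ports are simply the ones discussed in this thread.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


def open_ports(host: str, candidates: list) -> list:
    """Return the candidate ports on which something is listening."""
    return [port for port in candidates if port_is_open(host, port)]
```

Running `open_ports("127.0.0.1", [443, 8000, 8080])` on the instance shows which of these ports has a listener; the target group should point at one of the ports it returns.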

Gurpreet Singh

Make sure you've set the keep-alive timeout of your web server (in your case uvicorn) to something greater than the idle timeout of the AWS ALB, which is 60s by default. This way you make sure the service doesn't close the HTTP keep-alive connection before the ALB does.

For uvicorn it will be: uvicorn app.main:app --host 0.0.0.0 --timeout-keep-alive=65
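If you start uvicorn from Python rather than from the command line, the same setting can be passed as the `timeout_keep_alive` keyword of `uvicorn.run`. The sketch below is one way to keep the value tied to the ALB timeout; `uvicorn_options`, `serve` and the 5-second margin are illustrative choices, and port 8000 is taken from the docker-compose.yml in the question.

```python
# Default ALB idle timeout in seconds; adjust if you changed it on the ALB.
ALB_IDLE_TIMEOUT_SECONDS = 60


def uvicorn_options(alb_idle_timeout: int = ALB_IDLE_TIMEOUT_SECONDS,
                    margin: int = 5) -> dict:
    """Build uvicorn keyword arguments whose keep-alive outlasts the ALB."""
    return {
        "host": "0.0.0.0",
        "port": 8000,
        # Keep idle connections open slightly longer than the ALB does.
        "timeout_keep_alive": alb_idle_timeout + margin,
    }


def serve() -> None:
    """Start the app; requires uvicorn to be installed."""
    import uvicorn  # imported here so the helper above has no dependencies

    uvicorn.run("app.main:app", **uvicorn_options())
```

Calling `serve()` is then equivalent to the command above with `--port 8000` made explicit.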

ambi