
I am working on integrating AWS App Mesh into our existing Fargate infrastructure and I am running into an issue with the Envoy sidecar container crashing when inbound requests come in.

The specific error I'm getting for inbound requests is (full stack trace below):

[2023-03-20 20:48:10.145][46][critical][assert] [source/common/network/socket_interface_impl.cc:72] assert failure: SOCKET_VALID(result.return_value_). Details: socket(2) failed, got error: Too many open files

I'm new to Envoy and App Mesh, so while I work through tweaking Envoy-specific config variables I wanted to reach out and see if anyone can point me in a direction. I tried to decode the backtrace with tools/stack_decode.py, without any luck on the Amazon Envoy image.

With the Amazon Envoy image, I am essentially doing closed-box testing, and I want to make sure this is set up properly before adding customer configuration. Tweaks are common in most cases, but here there are so many options that I don't want to create new problems before diagnosing the current issue.

Thank you!

Information

Envoy Container: public.ecr.aws/appmesh/aws-appmesh-envoy:v1.25.1.0-prod
Fargate Version: 1.4.0

Envoy Sidecar Stack Trace

3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.510][1][warning] [AppNet Agent] [Envoy process 32] Exited with code [-1]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.511][1][warning] [AppNet Agent] [Envoy process 32] Additional Exit data: [Core Dump: false][Normal Exit: false][Process Signalled: true]   envoy-sidecard
3/20/2023, 1:48:10 PM   ActiveStream 0x4ebfb02f200, stream_id_: 12221016937207553230&filter_manager_:   envoy-sidecard
3/20/2023, 1:48:10 PM   FilterManager 0x4ebfb02f280, state_.has_1xx_headers_: 0 envoy-sidecard
3/20/2023, 1:48:10 PM   filter_manager_callbacks_.requestHeaders(): envoy-sidecard
3/20/2023, 1:48:10 PM   ':authority', 'api-frontproxy.aws-dev.example.com'  envoy-sidecard
3/20/2023, 1:48:10 PM   ':path', '/ping'    envoy-sidecard
3/20/2023, 1:48:10 PM   ':method', 'GET'    envoy-sidecard
3/20/2023, 1:48:10 PM   ':scheme', 'http'   envoy-sidecard
3/20/2023, 1:48:10 PM   'x-forwarded-for', '172.17.10.82'   envoy-sidecard
3/20/2023, 1:48:10 PM   'x-forwarded-proto', 'http' envoy-sidecard
3/20/2023, 1:48:10 PM   'x-forwarded-port', '80'    envoy-sidecard
3/20/2023, 1:48:10 PM   'x-amzn-trace-id', 'Root=1-6418c689-400bbf8a1ef4be9070170e55'   envoy-sidecard
3/20/2023, 1:48:10 PM   'pragma', 'no-cache'    envoy-sidecard
3/20/2023, 1:48:10 PM   'cache-control', 'no-cache' envoy-sidecard
3/20/2023, 1:48:10 PM   'upgrade-insecure-requests', '1'    envoy-sidecard
3/20/2023, 1:48:10 PM   'user-agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36'   envoy-sidecard
3/20/2023, 1:48:10 PM   'accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' envoy-sidecard
3/20/2023, 1:48:10 PM   'accept-encoding', 'gzip, deflate'  envoy-sidecard
3/20/2023, 1:48:10 PM   'accept-language', 'en-US,en;q=0.9' envoy-sidecard
3/20/2023, 1:48:10 PM   'x-envoy-expected-rq-timeout-ms', '60000'   envoy-sidecard
3/20/2023, 1:48:10 PM   'x-app-mesh-virtual-service', 'api-virtual-service' envoy-sidecard
3/20/2023, 1:48:10 PM   'x-app-mesh-match-port', '15000'    envoy-sidecard
3/20/2023, 1:48:10 PM   'x-envoy-original-path', '/ping'    envoy-sidecard
3/20/2023, 1:48:10 PM   'x-envoy-internal', 'true'  envoy-sidecard
3/20/2023, 1:48:10 PM   filter_manager_callbacks_.requestTrailers(): null   envoy-sidecard
3/20/2023, 1:48:10 PM   filter_manager_callbacks_.responseHeaders(): null   envoy-sidecard
3/20/2023, 1:48:10 PM   filter_manager_callbacks_.responseTrailers(): null  envoy-sidecard
3/20/2023, 1:48:10 PM   &streamInfo():  envoy-sidecard
3/20/2023, 1:48:10 PM   StreamInfoImpl 0x4ebfb02f3b8, protocol_: 1, response_code_: null, response_code_details_: null, attempt_count_: 1, health_check_request_: 0, route_name_: upstream_info_:   envoy-sidecard
3/20/2023, 1:48:10 PM   UpstreamInfoImpl 0x4ebfb052010, upstream_connection_id_: null   envoy-sidecard
3/20/2023, 1:48:10 PM   OverridableRemoteConnectionInfoSetterStreamInfo 0x4ebfb02f3b8, remoteAddress(): 172.17.10.82:0, directRemoteAddress(): 127.0.0.1:58104, localAddress(): 127.0.0.1:15000 envoy-sidecard
3/20/2023, 1:48:10 PM   Http1::ConnectionImpl 0x4ebfdba7608, dispatching_: 1, dispatching_slice_already_drained_: 0, reset_stream_called_: 0, handling_upgrade_: 0, deferred_end_stream_headers_: 1, processing_trailers_: 0, buffered_body_.length(): 0, header_parsing_state_: Done, current_header_field_: , current_header_value_:  envoy-sidecard
3/20/2023, 1:48:10 PM   active_request_:    envoy-sidecard
3/20/2023, 1:48:10 PM   , request_url_: null, response_encoder_.local_end_stream_: 0    envoy-sidecard
3/20/2023, 1:48:10 PM   absl::get<RequestHeaderMapPtr>(headers_or_trailers_): null  envoy-sidecard
3/20/2023, 1:48:10 PM   current_dispatching_buffer_ front_slice length: 1124 contents: "GET /ping HTTP/1.1\r\nhost: api-frontproxy.aws-dev.example.com\r\nx-forwarded-for: 172.17.10.82\r\nx-forwarded-proto: http\r\nx-forwarded-port: 80\r\nx-amzn-trace-id: Root=1-6418c689-400bbf8a1ef4be9070170e55\r\npragma: no-cache\r\ncache-control: no-cache\r\nupgrade-insecure-requests: 1\r\nuser-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36\r\naccept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9\r\naccept-encoding: gzip, deflate\r\naccept-language: en-US,en;q=0.9\r\ncookie: _ga=GA1.2.368083361.1661187484; _gcl_au=1.1.1031421150.1675887206; AWSALBTG=pq8rTAfHGYVqff7GAlHe6Zy3uwDvkb4MWTUhXecgMU9lwctiXYasZ7fcQwqrr58Y9KU4W0/5EQf+rcsqLspUXYBwWfOMr07pZAmf6zCHYU8hLrL74WQFOfee3HN/C0szgoE/EDKbuOGahYyWiqIcD44cggquJpJUAOfZhpnX4YVWH75nWDY=\r\nx-request-id: ce121a8b-35b3-9ca9-a64e-f553a59be8fb\r\nx-envoy-expected-rq-timeout-ms: 60000\r\nx-app-mesh-virtual-service: api-virtual-service\r\nx-app-mesh-match-port: 15000\r\nx-envoy-original-path: /ping\r\nx-envoy-internal: true\r\n\r\n"    envoy-sidecard
3/20/2023, 1:48:10 PM   ConnectionImpl 0x4ebfbe082e0, connecting_: 0, bind_error_: 0, state(): Open, read_buffer_limit_: 1048576    envoy-sidecard
3/20/2023, 1:48:10 PM   socket_:    envoy-sidecard
3/20/2023, 1:48:10 PM   ListenSocketImpl 0x4ebfdcb7180, transport_protocol_: raw_buffer envoy-sidecard
3/20/2023, 1:48:10 PM   connection_info_provider_:  envoy-sidecard
3/20/2023, 1:48:10 PM   ConnectionInfoSetterImpl 0x4ebfd22fb30, remote_address_: 127.0.0.1:58104, direct_remote_address_: 127.0.0.1:58104, local_address_: 127.0.0.1:15000, server_name_:   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][assert] [source/common/network/socket_interface_impl.cc:72] assert failure: SOCKET_VALID(result.return_value_). Details: socket(2) failed, got error: Too many open files   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:104] Caught Aborted, suspect faulting address 0x1e6100000020    envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:91] Backtrace (use tools/stack_decode.py to get line numbers):  envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:92] Envoy version: 5deaab910096a3378e7332350a8661d259863e6d/1.24.1-appmesh.0/Modified/RELEASE/BoringSSL envoy-sidecard
3/20/2023, 1:48:10 PM   [symbolize_elf.inc : 1000] RAW: /proc/self/task/32/maps: errno=24   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #0: [0x7f4bf2ae28e0]    envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #1: [0x55ea516e19b4]    envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #2: [0x55ea515dafee]    envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #3: [0x55ea515d7455]    envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #4: [0x55ea515cf36f]    envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #5: [0x55ea515c6408]    envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #6: [0x55ea510bca5e]    envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #7: [0x55ea510bc309]    envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #8: [0x55ea510bd095]    envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #9: [0x55ea5109bd2d]    envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #10: [0x55ea5109bb02]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #11: [0x55ea5109d0dc]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #12: [0x55ea5109d4eb]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #13: [0x55ea510b1c43]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #14: [0x55ea510b4a3d]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #15: [0x55ea510a3712]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #16: [0x55ea5137ddd7]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #17: [0x55ea51397053]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #18: [0x55ea51383c52]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #19: [0x55ea513ff434]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #20: [0x55ea51271858]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #21: [0x55ea512ab396]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #22: [0x55ea512a8002]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #23: [0x55ea512a7d18]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #24: [0x55ea516fd4f7]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #25: [0x55ea512a5f56]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #26: [0x55ea512a57ef]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #27: [0x55ea512ab1c8]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #28: [0x55ea5126d26c]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #29: [0x55ea515dcea0]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #30: [0x55ea515d5dd3]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #31: [0x55ea515d3bfa]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #32: [0x55ea515c922f]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #33: [0x55ea515ca5c3]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #34: [0x55ea516f6f70]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #35: [0x55ea516f58b1]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #36: [0x55ea50d7ed18]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #37: [0x55ea518de9b5]   envoy-sidecard
3/20/2023, 1:48:10 PM   [2023-03-20 20:48:10.145][46][critical][backtrace] [./source/server/backtrace.h:98] #38: [0x7f4bf2ad844b]   envoy-sidecard
3/20/2023, 1:46:39 PM   [2023-03-20 20:46:39.290][1][error] [AppNet Agent] Envoy readiness check failed with: Get "http://127.0.0.1:9901/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) envoy-sidecard
3/20/2023, 1:46:31 PM   [2023-03-20 20:46:31.291][1][error] [AppNet Agent] Envoy connectivity check failed with: Get "http://127.0.0.1:9901/stats?filter=control_plane.connected_state&format=json": context deadline exceeded (Client.Timeout exceeded while awaiting headers) envoy-sidecard
3/20/2023, 1:46:29 PM   [2023-03-20 20:46:29.290][1][error] [AppNet Agent] Envoy readiness check failed with: Get "http://127.0.0.1:9901/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) envoy-sidecard
3/20/2023, 1:46:21 PM   [2023-03-20 20:46:21.291][1][error] [AppNet Agent] Envoy connectivity check failed with: Get "http://127.0.0.1:9901/stats?filter=control_plane.connected_state&format=json": context deadline exceeded (Client.Timeout exceeded while awaiting headers) envoy-sidecard
3/20/2023, 1:46:19 PM   [2023-03-20 20:46:19.290][1][error] [AppNet Agent] Envoy readiness check failed with: Get "http://127.0.0.1:9901/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) envoy-sidecard
3/20/2023, 1:46:09 PM   [2023-03-20 20:46:09.235][1][error] [AppNet Agent] Envoy readiness check failed with: Get "http://127.0.0.1:9901/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) envoy-sidecard
3/20/2023, 1:45:59 PM   [2023-03-20 20:45:59.224][1][error] [AppNet Agent] Envoy readiness check failed with: Get "http://127.0.0.1:9901/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) envoy-sidecard
3/20/2023, 1:45:51 PM   [2023-03-20 20:45:51.214][1][error] [AppNet Agent] Envoy connectivity check failed with: Get "http://127.0.0.1:9901/stats?filter=control_plane.connected_state&format=json": context deadline exceeded (Client.Timeout exceeded while awaiting headers) envoy-sidecard
3/20/2023, 1:45:49 PM   [2023-03-20 20:45:49.213][1][error] [AppNet Agent] Envoy readiness check failed with: Get "http://127.0.0.1:9901/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) envoy-sidecard
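
The AppNet Agent readiness failures at the bottom of this log are plain HTTP GETs against the Envoy admin endpoint. A minimal sketch for reproducing that probe by hand is below. It assumes Python is available somewhere that can reach 127.0.0.1:9901 inside the task, which may not be true of the Amazon Envoy image. The function name `envoy_ready` and the 2 second timeout are my own choices, not anything taken from the agent.

```python
import urllib.error
import urllib.request


def envoy_ready(url="http://127.0.0.1:9901/ready", timeout_s=2.0,
                opener=urllib.request.urlopen):
    """Return True if the endpoint answers HTTP 200 within timeout_s seconds.

    `opener` is injectable only so the function can be exercised without a
    running Envoy; by default it is urllib's urlopen.
    """
    try:
        with opener(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Refused connections, timeouts and HTTP error statuses all land here.
        return False
```

Calling `envoy_ready()` in a loop while sending a single inbound request should show whether the admin port stops answering before or after the crash.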

Fargate Task Definition

{
    "taskDefinitionArn": "arn:aws:ecs:us-west-2:<account-id>:task-definition/api-family-dev-us-west-2:65",
    "containerDefinitions": [
        {
            "name": "api",
            "image": "private-registry/api:verison",
            "repositoryCredentials": {
                "credentialsParameter": "arn:aws:secretsmanager:us-west-2:<ACCOUNT-ID>:secret:<Secret>"
            },
            "cpu": 2,
            "memoryReservation": 16,
            "portMappings": [
                {
                    "containerPort": 5000,
                    "hostPort": 5000,
                    "protocol": "tcp"
                }
            ],
            "essential": true,
            "environment": [
                {
                    "name": "APP_ENVIRONMENT",
                    "value": "dev-us-west-2"
                }
            ],
            "mountPoints": [],
            "volumesFrom": [],
            "dependsOn": [
                {
                    "containerName": "api-task-dev-us-west-2-envoy-sidecar",
                    "condition": "HEALTHY"
                }
            ],
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs_logs",
                    "awslogs-region": "us-west-2",
                    "awslogs-stream-prefix": "ecs"
                }
            }
        },
        {
            "name": "envoy-sidecar",
            "image": "private-registry/aws-appmesh-envoy:v1.25.1.0-prod",
            "repositoryCredentials": {
                "credentialsParameter": "arn:aws:secretsmanager:us-west-2:<ACCOUNT-ID>:secret:<Secret>"
            },
            "cpu": 512,
            "memoryReservation": 1024,
            "portMappings": [
                {
                    "containerPort": 15000,
                    "hostPort": 15000,
                    "protocol": "tcp"
                }
            ],
            "essential": true,
            "environment": [
                {
                    "name": "PID_POLL_INTERVAL_MS",
                    "value": "1000"
                },
                {
                    "name": "APPMESH_RESOURCE_ARN",
                    "value": "arn:aws:appmesh:us-west-2:<ACCOUNT-ID>:mesh/api/virtualNode/api-virtual-node"
                },
                {
                    "name": "APP_ENVIRONMENT",
                    "value": "dev-us-west-2"
                },
                {
                    "name": "APPNET_ENVOY_RESTART_COUNT",
                    "value": "10"
                },
                {
                    "name": "LISTENER_DRAIN_WAIT_TIME_S",
                    "value": "110"
                }
            ],
            "mountPoints": [],
            "volumesFrom": [],
            "user": "1337",
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs_logs",
                    "awslogs-region": "us-west-2",
                    "awslogs-stream-prefix": "ecs"
                }
            },
            "healthCheck": {
                "command": [
                    "CMD-SHELL",
                    "echo hello"
                ],
                "interval": 5,
                "timeout": 2,
                "retries": 3
            }
        }
    ],
    "family": "api-family-dev-us-west-2",
    "taskRoleArn": "arn:aws:iam::<ACCOUNT-ID>:role/system/Dev-CustomRole",
    "executionRoleArn": "arn:aws:iam::<ACCOUNT-ID>:role/system/Dev-CustomRole",
    "networkMode": "awsvpc",
    "revision": 65,
    "volumes": [],
    "status": "ACTIVE",
    "requiresAttributes": [
        {
            "name": "ecs.capability.execution-role-awslogs"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.17"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.21"
        },
        {
            "name": "com.amazonaws.ecs.capability.task-iam-role"
        },
        {
            "name": "ecs.capability.aws-appmesh"
        },
        {
            "name": "ecs.capability.container-health-check"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
        },
        {
            "name": "ecs.capability.task-eni"
        },
        {
            "name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.24"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
        },
        {
            "name": "ecs.capability.private-registry-authentication.secretsmanager"
        },
        {
            "name": "ecs.capability.container-ordering"
        }
    ],
    "placementConstraints": [],
    "compatibilities": [
        "EC2",
        "FARGATE"
    ],
    "requiresCompatibilities": [
        "FARGATE"
    ],
    "cpu": "1024",
    "memory": "2048",
    "proxyConfiguration": {
        "type": "APPMESH",
        "containerName": "api-task-dev-us-west-2-envoy-sidecar",
        "properties": [
            {
                "name": "ProxyIngressPort",
                "value": "15000"
            },
            {
                "name": "AppPorts",
                "value": "5000"
            },
            {
                "name": "EgressIgnoredIPs",
                "value": ""
            },
            {
                "name": "IgnoredUID",
                "value": "1337"
            },
            {
                "name": "ProxyEgressPort",
                "value": "15001"
            }
        ]
    },
    "registeredAt": "2023-03-21T16:39:19.136Z",
    "registeredBy": "arn:aws:sts::<ACCOUNT-ID>:assumed-role/<private>",
    "tags": [
        {
            "key": "name",
            "value": "api-srvc-dev-us-west-2-task"
        }
    ]
}
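
As a sanity check on the definition above, a small script can confirm that every container name referenced by dependsOn and proxyConfiguration is actually defined in containerDefinitions. In the definition as pasted the names differ (api-task-dev-us-west-2-envoy-sidecar versus envoy-sidecar), which may only be a side effect of redacting it for this post. This is a rough sketch; the helper name `undefined_container_refs` is hypothetical and it only covers these two fields.

```python
def undefined_container_refs(task_def):
    """Return (where, name) pairs that reference a container not in the task."""
    containers = task_def.get("containerDefinitions", [])
    defined = {c["name"] for c in containers}
    problems = []

    # Each container may declare start-up dependencies on other containers.
    for c in containers:
        for dep in c.get("dependsOn", []):
            if dep["containerName"] not in defined:
                problems.append((f"{c['name']}.dependsOn", dep["containerName"]))

    # The App Mesh proxy configuration names the sidecar container.
    proxy = task_def.get("proxyConfiguration")
    if proxy and proxy.get("containerName") not in defined:
        problems.append(("proxyConfiguration", proxy.get("containerName")))

    return problems
```

Loading the task definition with `json.load` and passing the resulting dict in returns an empty list when every reference resolves.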

Testing

I have tried extending the timeout environment variables described in the AWS Envoy Proxy documentation. The Fargate task definition above is the config with the maximum timeouts for all settings.

I have isolated this to being an Envoy setup issue and unrelated to any VPC, EC2, or networking issues. My hypothesis is I am misconfiguring the Envoy proxy somewhere.

Edit 1: Configure Task Definition ulimit

I posted this on Reddit and a user suggested editing the task ulimit to help resolve the too many open files issues.

I have tested several configurations, but even with the limits raised I am still getting the too many open files error.

Here are the various ulimit values I have tried:

... 4096 -> Still getting too many files
"ulimits": [
    {
        "name": "nofile",
        "softLimit": 4096,
        "hardLimit": 4096
    }
]


... 8192 -> Still getting too many files
"ulimits": [
    {
        "name": "nofile",
        "softLimit": 8192,
        "hardLimit": 8192
    }
]

... 1048576 -> Envoy never starts initializing and logs:
[info][main] [source/server/drain_manager_impl.cc:171] shutting down parent after drain
"ulimits": [
    {
        "name": "nofile",
        "softLimit": 1048576,
        "hardLimit": 1048576
    }
]

The hardLimit defined by Fargate is 4096, but I wanted to test EC2 limits to see if it would make a difference. Some suggestions online mentioned testing higher values (8192, 65536, or higher). Reference: ECS post

So far the only difference I have observed is that Envoy does not initialize at all with a higher ulimit.
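
To confirm which nofile limit a process inside the task actually sees, rather than what the task definition asks for, something like the following could be run in a container of the task. It is only a sketch: it assumes a Linux container with Python and /proc available (the Amazon Envoy image may not ship Python), and it reports on the Python process itself, not on Envoy. The name `nofile_report` is made up for illustration.

```python
import os
import resource


def nofile_report():
    """Return the soft/hard RLIMIT_NOFILE values and the open descriptor count."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # On Linux, /proc/self/fd has one entry per open descriptor. The count may
    # include the descriptor used to read this directory.
    open_fds = len(os.listdir("/proc/self/fd"))
    return {"soft": soft, "hard": hard, "open": open_fds}


if __name__ == "__main__":
    print(nofile_report())
```

If the reported soft limit does not match the softLimit in the ulimits block, the setting is not reaching the container.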
