I'm running Envoy in a container as a reverse proxy for my load-balanced application. We recently had a network outage, during which Envoy returned HTTP 503 to our downstream clients. Normally, once the network recovers, Envoy reconnects to our upstream servers and requests succeed again.
This time, however, the downstream clients kept receiving HTTP 503 with the LR response flag after the network had recovered, and this continued until I restarted the containers. Am I missing something in my configuration? Can someone point me in the right direction on how to resolve this?
admin:
  access_log_path: /tmp/admin_access.log
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        protocol: TCP
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
          stat_prefix: ingress_http
          server_name: none
          route_config:
            name: local_route
            response_headers_to_remove:
            - x-envoy-upstream-service-time
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  host_rewrite: api.example.com
                  cluster: service_api
          http_filters:
          - name: envoy.filters.http.router
          access_log:
          - name: envoy.access_loggers.file
            filter:
              not_health_check_filter: {}
            config:
              path: "/dev/stdout"
  clusters:
  - name: service_api
    connect_timeout: 30s
    type: LOGICAL_DNS
    dns_resolvers:
      socket_address:
        address: 127.0.0.1
        port_value: 53
    # Comment out the following line to test on v6 networks
    dns_lookup_family: V4_ONLY
    lb_policy: ROUND_ROBIN
    upstream_connection_options:
      tcp_keepalive: {}
    load_assignment:
      cluster_name: service_api
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: api.example.com
                port_value: 443
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
Any pointers will be much appreciated.