
Description: The k8s nginx-ingress-controllers are exposed via a LoadBalancer-type service (implemented by MetalLB) with the IP address 192.168.1.254. Another nginx cluster sits in front of the k8s cluster, and it has only one upstream, which is 192.168.1.254 (the LB IP address). The request flow is: client -> nginx cluster -> nginx-ingress-controllers -> services.
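
In nginx terms the front cluster's config is shaped roughly like this (a minimal sketch; the upstream name and listen port are illustrative):

    # inside the http {} block of the front nginx cluster:
    # a single upstream pointing at the MetalLB VIP
    upstream k8s_ingress {
        server 192.168.1.254:80;
    }

    server {
        listen 80;
        location / {
            proxy_pass http://k8s_ingress;
        }
    }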

Question: Occasionally the nginx cluster reports a small number of "upstream (192.168.1.254) timed out" errors, and in those cases the client gets a 504 timeout from nginx.

But when I dropped the nginx cluster and switched the request flow to client -> nginx-ingress-controllers -> services, everything went well and the client no longer got 504 timeouts. I am sure the network between the nginx cluster and the nginx ingress controller works well.

Most requests are handled by the nginx cluster and return status 200. I have no idea why a few requests report "upstream timed out" and return status 504.

[Image: system architecture]

[Image: nginx cluster reports timeout]

[Image: tcpdump packet trace]

JONE LEE
  • What is the connection timeout value? Is it too small? – Sameer Naik Jan 22 '21 at 04:11
  • proxy_connect_timeout is set to 60s. – JONE LEE Jan 22 '21 at 06:33
  • Try setting 'worker_processes' to something larger than the default of 1. Best is to use the number of processors the machine has. – OldFart Jan 22 '21 at 08:24
  • A timeout could also be due to bad POST data that is passed on to the upstream without verification. If the upstream dies or loops, nginx doesn't know about it, and after the default 'proxy_read_timeout' and 'proxy_send_timeout' of 60 seconds it just returns an error to the client. – OldFart Jan 22 '21 at 08:27
  • The nginx ingress controller runs in a pod with 2 CPU cores, and nginx.conf shows 2 worker_processes. But I found the kernel setting "net.core.somaxconn" was set to 128. Is it too small? (See the sketch after these comments.) – JONE LEE Jan 22 '21 at 10:30
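
On the net.core.somaxconn point above: the kernel caps nginx's listen backlog at net.core.somaxconn (nginx's default backlog on Linux is 511), so a host value of 128 silently limits the accept queue. A minimal sketch of raising both, with illustrative numbers:

    # requires the host sysctl to be raised as well, e.g.:
    #   sysctl -w net.core.somaxconn=1024
    server {
        listen 80 backlog=1024;   # ask the kernel for a larger accept queue
    }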

1 Answer


That's most likely slow file uploads (the requests you've shown are all POSTs): something that can't finish within the timeout limit.

You can set a greater timeout value for application paths where uploads are possible. If you are using an ingress controller, you'd better create a separate Ingress object for that. You can manage timeouts with these annotations, for example:

  annotations:
    nginx.ingress.kubernetes.io/proxy-send-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"

These two annotations raise the send and read timeouts to 300 seconds (5 minutes), which effectively sets the maximum upload time.

If you are configuring nginx manually, you can set these limits with the proxy_read_timeout and proxy_send_timeout directives.
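
A minimal manual equivalent of the annotations above (the location path and upstream name are placeholders):

    location /upload {
        proxy_pass http://backend;   # placeholder upstream
        proxy_send_timeout 300s;     # time allowed for sending the request to the upstream
        proxy_read_timeout 300s;     # time allowed between two reads from the upstream
    }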

anemyte
  • Thanks for answering. The "upstream timed out" errors still happened after I set the timeout to 300s. The POST requests should not contain any uploaded files, only JSON data. This happened when I set the ingress controller as an upstream behind the nginx cluster (client -> nginx -> ingress controller). However, there was no timeout when I dropped the nginx cluster and let the client access the ingress controller directly (client -> ingress controller). Is there something wrong between the nginx cluster and the ingress controller? I know nginx keeps connections alive with the upstream, but this time the upstream is an LB IP address. – JONE LEE Jan 22 '21 at 10:26
  • @JONELEE Since you have two nginx instances in a chain, both have the timeout option. Did you check this property's value on the nginx cluster? Is it the same? Can you simulate a long request to find out how long it takes for the timeout to happen? – anemyte Jan 22 '21 at 10:42
  • I found the timeout property in the nginx cluster was different from the ingress controller in k8s. I have now set it to the same value and will keep watching it (a sketch of the aligned config follows below). Thanks. – JONE LEE Jan 25 '21 at 01:22
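
For reference, a sketch of what aligning the front nginx with the ingress controller can look like, with upstream keepalive enabled so connections to the LB VIP are reused (the directives are standard nginx; the values are illustrative):

    upstream k8s_ingress {
        server 192.168.1.254:80;
        keepalive 32;                       # keep idle connections to the VIP for reuse
    }

    server {
        listen 80;
        location / {
            proxy_pass http://k8s_ingress;
            proxy_http_version 1.1;         # required for upstream keepalive
            proxy_set_header Connection ""; # clear "Connection: close" from the client
            proxy_connect_timeout 60s;      # match the ingress controller's values
            proxy_read_timeout    60s;
            proxy_send_timeout    60s;
        }
    }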