We are seeing TCP dial timeouts when the CPU load on a node is high (we run on Kubernetes): users receive a 504 status code once a request exceeds 30 seconds, but the server never receives those requests.
We use Traefik as our Ingress controller, and I want to find out whether its `forwardingTimeouts.dialTimeout` setting is what produces these errors. How can I simulate a TCP dial timeout for testing?
[Screenshot: timed-out requests (time / request_time / upstream_status)]
[Screenshot: CPU usage]