We have a Wagtail server deployed in a third-party datacenter. The deployment architecture is simple and conventional: Proxy (Nginx) --> App Server --> DB. The portal performs well under moderate traffic, but when there are events at the university, the load peaks and the proxy returns 504 Gateway Timeout.
What confuses us is that the memory and resource utilization of each box stays very low even while the proxy is timing out. We did some vertical scaling, but the issue returns whenever traffic peaks. The proxy now runs on a 16-core machine, the app server on a 32-core machine, and the DB on a 16-core machine.
Since we are not seeing any memory issues, we assume that the Wagtail implementation is not the culprit. I know this description is minimal, but I am unsure how to start debugging the issue. Any pointers would be helpful. Thank you.