I am running a Django web app on an EC2 server using Nginx and uWSGI. I also have Celery running some background tasks (no cron jobs, only tasks triggered by occasional user actions).
The app is in early closed beta and currently has no active users.
Over the past three days, the server has repeatedly fallen over after the CPU load spiked very high, seemingly at random (see screengrab).
Before this, the app had been running without issue for weeks. I made some code changes to the website (mostly consolidating models), but did not change the server configuration.
I looked through the logs (Nginx access.log and error.log, and Django debug.log) but did not see any errors or oddities. I don't have access to the logs right now.
In addition, I see a similar effect when migrating model changes (inside the venv) if I haven't restarted the server beforehand. Sometimes, even after restarting the server, it becomes so slow that I have to wait several minutes for Celery to restart.
I need help finding a starting point for investigating the problem. Any ideas?