Hi, I am using an AWS EC2 instance (c5.18xlarge, 72 vCPUs and 144 GiB RAM) with Nginx + Django and PostgreSQL. The problem is that it reaches 100% CPU utilisation at 400-500 concurrent users and slows down. Here are my settings:
postgresql.conf
max_connections = 800
shared_buffers = 50000MB
nginx.conf
user www-data;
worker_processes 128;
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;
worker_rlimit_nofile 50000;
events {
    worker_connections 10000;
    #multi_accept on;
}
sysctl.conf
net.core.somaxconn = 65000
net.core.netdev_max_backlog = 65535
fs.file-max = 90000
net.ipv4.ip_forward = 1
kernel.shmmax=1006632960
gunicorn_start.bash
NAME="django_app" # Name of the application
DJANGODIR=/home/ubuntu/project # Django project directory
SOCKFILE=/home/ubuntu/env/run/gunicorn.sock # we will communicate using this unix socket
USER=ubuntu # the user to run as
GROUP=ubuntu # the group to run as
NUM_WORKERS=128 # how many worker processes Gunicorn should spawn
TIMEOUT=120
DJANGO_SETTINGS_MODULE=project.settings # which settings file should Django use
DJANGO_WSGI_MODULE=project.wsgi # WSGI module name
echo "Starting $NAME as `whoami`"
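To make the settings above easier to compare, here is a rough arithmetic check of how they relate to each other. It only uses the numbers from this post; the function name `summarize` and the constant names are made up for illustration. Whether the `kernel.shmmax` mismatch matters probably depends on the PostgreSQL version, which I have not stated here, so treat this as a sanity check and not a diagnosis.

```python
# Values copied from the settings in this post.
VCPUS = 72                         # c5.18xlarge
SHARED_BUFFERS_MB = 50000          # postgresql.conf: shared_buffers
KERNEL_SHMMAX_BYTES = 1006632960   # sysctl.conf: kernel.shmmax
FS_FILE_MAX = 90000                # sysctl.conf: fs.file-max
NGINX_WORKERS = 128                # nginx.conf: worker_processes
NGINX_RLIMIT_NOFILE = 50000        # nginx.conf: worker_rlimit_nofile
GUNICORN_WORKERS = 128             # gunicorn_start.bash: NUM_WORKERS

MIB = 1024 * 1024


def summarize():
    """Return a few derived numbers that show how the settings relate."""
    shmmax_mib = KERNEL_SHMMAX_BYTES / MIB
    nginx_fd_ceiling = NGINX_WORKERS * NGINX_RLIMIT_NOFILE
    return {
        # kernel.shmmax expressed in MiB, for comparison with shared_buffers
        "shmmax_mib": shmmax_mib,
        "shared_buffers_exceeds_shmmax": SHARED_BUFFERS_MB > shmmax_mib,
        # Upper bound on descriptors all nginx workers could open together
        "nginx_fd_ceiling": nginx_fd_ceiling,
        "nginx_fd_ceiling_exceeds_file_max": nginx_fd_ceiling > FS_FILE_MAX,
        # Nginx plus Gunicorn worker processes competing for each vCPU
        "processes_per_vcpu": (NGINX_WORKERS + GUNICORN_WORKERS) / VCPUS,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

So `kernel.shmmax` works out to 960 MiB against 50000MB of `shared_buffers`, the per-worker file limit times 128 Nginx workers is far above `fs.file-max`, and there are about 3.6 worker processes per vCPU before counting PostgreSQL backends.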
I am fairly sure the server should not be overloaded by the app's current usage. Please suggest how this can be solved, or how I can debug it to find the cause.
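One thing I could try for debugging is a small WSGI wrapper that records wall-clock time and CPU time per request, to see whether slow requests are burning CPU in Python or mostly waiting (for example on the database). This is only a minimal sketch: `TimingMiddleware`, `slow_seconds` and `sink` are hypothetical names, and it assumes Gunicorn's synchronous workers, where each process handles one request at a time so the process-wide CPU counter roughly matches the request.

```python
import time


class TimingMiddleware:
    """Wrap a WSGI app and record wall-clock vs. CPU time per request."""

    def __init__(self, app, slow_seconds=0.5, sink=None):
        self.app = app
        self.slow_seconds = slow_seconds
        # Records are appended here; a real setup would probably log them.
        self.sink = sink if sink is not None else []

    def __call__(self, environ, start_response):
        wall_start = time.perf_counter()
        cpu_start = time.process_time()  # CPU time of this whole process
        try:
            # Note: this times only the call itself, not the iteration of a
            # lazily streamed response body.
            return self.app(environ, start_response)
        finally:
            wall = time.perf_counter() - wall_start
            cpu = time.process_time() - cpu_start
            if wall >= self.slow_seconds:
                self.sink.append({
                    "method": environ.get("REQUEST_METHOD"),
                    "path": environ.get("PATH_INFO"),
                    "wall_seconds": wall,
                    "cpu_seconds": cpu,
                })
```

In `project/wsgi.py` this would presumably wrap the object returned by Django's `get_wsgi_application()`. If `cpu_seconds` is close to `wall_seconds` for the slow paths, the time is going into Python code; if it is much smaller, the request is mostly waiting on something else.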