I am running a django app with nginx & uwsgi. Here's how i run uwsgi:
sudo uwsgi -b 25000 --chdir=/www/python/apps/pyapp --module=wsgi:application --env DJANGO_SETTINGS_MODULE=settings --socket=/tmp/pyapp.socket --cheaper=8 --processes=16 --harakiri=10 --max-requests=5000 --vacuum --master --pidfile=/tmp/pyapp-master.pid --uid=220 --gid=499
& nginx configurations:
server {
listen 80;
server_name test.com
root /www/python/apps/pyapp/;
access_log /var/log/nginx/test.com.access.log;
error_log /var/log/nginx/test.com.error.log;
# https://docs.djangoproject.com/en/dev/howto/static-files/#serving-static-files-in-production
location /static/ {
alias /www/python/apps/pyapp/static/;
expires 30d;
}
location /media/ {
alias /www/python/apps/pyapp/media/;
expires 30d;
}
location / {
uwsgi_pass unix:///tmp/pyapp.socket;
include uwsgi_params;
proxy_read_timeout 120;
}
# what to serve if upstream is not available or crashes
#error_page 500 502 503 504 /media/50x.html;
}
Here comes the problem. When doing "ab" (ApacheBenchmark) on the server i get the following results:
nginx version: nginx version: nginx/1.2.6
uwsgi version:1.4.5
Server Software: nginx/1.0.15
Server Hostname: pycms.com
Server Port: 80
Document Path: /api/nodes/mostviewed/8/?format=json
Document Length: 8696 bytes
Concurrency Level: 100
Time taken for tests: 41.232 seconds
Complete requests: 1000
Failed requests: 0
Write errors: 0
Total transferred: 8866000 bytes
HTML transferred: 8696000 bytes
Requests per second: 24.25 [#/sec] (mean)
Time per request: 4123.216 [ms] (mean)
Time per request: 41.232 [ms] (mean, across all concurrent requests)
Transfer rate: 209.99 [Kbytes/sec] received
While running on 500 concurrency level
oncurrency Level: 500
Time taken for tests: 2.175 seconds
Complete requests: 1000
Failed requests: 50
(Connect: 0, Receive: 0, Length: 50, Exceptions: 0)
Write errors: 0
Non-2xx responses: 950
Total transferred: 629200 bytes
HTML transferred: 476300 bytes
Requests per second: 459.81 [#/sec] (mean)
Time per request: 1087.416 [ms] (mean)
Time per request: 2.175 [ms] (mean, across all concurrent requests)
Transfer rate: 282.53 [Kbytes/sec] received
As you can see... all requests on the server fail with either timeout errors or "Client prematurely disconnected" or:
writev(): Broken pipe [proto/uwsgi.c line 124] during GET /api/nodes/mostviewed/9/?format=json
Here's a little bit more about my application: Basically, it's a collection of models that reflect MySQL tables which contain all the content. At the frontend, i have django-rest-framework which serves json content to the clients.
I've installed django-profiling & django debug toolbar to see whats going on. On django-profiling here's what i get when running a single request:
Instance wide RAM usage
Partition of a set of 147315 objects. Total size = 20779408 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 63960 43 5726288 28 5726288 28 str
1 36887 25 3131112 15 8857400 43 tuple
2 2495 2 1500392 7 10357792 50 dict (no owner)
3 615 0 1397160 7 11754952 57 dict of module
4 1371 1 1236432 6 12991384 63 type
5 9974 7 1196880 6 14188264 68 function
6 8974 6 1076880 5 15265144 73 types.CodeType
7 1371 1 1014408 5 16279552 78 dict of type
8 2684 2 340640 2 16620192 80 list
9 382 0 328912 2 16949104 82 dict of class
<607 more rows. Type e.g. '_.more' to view.>
CPU Time for this request
11068 function calls (10158 primitive calls) in 0.064 CPU seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.064 0.064 /usr/lib/python2.6/site-packages/django/views/generic/base.py:44(view)
1 0.000 0.000 0.064 0.064 /usr/lib/python2.6/site-packages/django/views/decorators/csrf.py:76(wrapped_view)
1 0.000 0.000 0.064 0.064 /usr/lib/python2.6/site-packages/rest_framework/views.py:359(dispatch)
1 0.000 0.000 0.064 0.064 /usr/lib/python2.6/site-packages/rest_framework/generics.py:144(get)
1 0.000 0.000 0.064 0.064 /usr/lib/python2.6/site-packages/rest_framework/mixins.py:46(list)
1 0.000 0.000 0.038 0.038 /usr/lib/python2.6/site-packages/rest_framework/serializers.py:348(data)
21/1 0.000 0.000 0.038 0.038 /usr/lib/python2.6/site-packages/rest_framework/serializers.py:273(to_native)
21/1 0.000 0.000 0.038 0.038 /usr/lib/python2.6/site-packages/rest_framework/serializers.py:190(convert_object)
11/1 0.000 0.000 0.036 0.036 /usr/lib/python2.6/site-packages/rest_framework/serializers.py:303(field_to_native)
13/11 0.000 0.000 0.033 0.003 /usr/lib/python2.6/site-packages/django/db/models/query.py:92(__iter__)
3/1 0.000 0.000 0.033 0.033 /usr/lib/python2.6/site-packages/django/db/models/query.py:77(__len__)
4 0.000 0.000 0.030 0.008 /usr/lib/python2.6/site-packages/django/db/models/sql/compiler.py:794(execute_sql)
1 0.000 0.000 0.021 0.021 /usr/lib/python2.6/site-packages/django/views/generic/list.py:33(paginate_queryset)
1 0.000 0.000 0.021 0.021 /usr/lib/python2.6/site-packages/django/core/paginator.py:35(page)
1 0.000 0.000 0.020 0.020 /usr/lib/python2.6/site-packages/django/core/paginator.py:20(validate_number)
3 0.000 0.000 0.020 0.007 /usr/lib/python2.6/site-packages/django/core/paginator.py:57(_get_num_pages)
4 0.000 0.000 0.020 0.005 /usr/lib/python2.6/site-packages/django/core/paginator.py:44(_get_count)
1 0.000 0.000 0.020 0.020 /usr/lib/python2.6/site-packages/django/db/models/query.py:340(count)
1 0.000 0.000 0.020 0.020 /usr/lib/python2.6/site-packages/django/db/models/sql/query.py:394(get_count)
1 0.000 0.000 0.020 0.020 /usr/lib/python2.6/site-packages/django/db/models/query.py:568(_prefetch_related_objects)
1 0.000 0.000 0.020 0.020 /usr/lib/python2.6/site-packages/django/db/models/query.py:1596(prefetch_related_objects)
4 0.000 0.000 0.020 0.005 /usr/lib/python2.6/site-packages/django/db/backends/util.py:36(execute)
1 0.000 0.000 0.020 0.020 /usr/lib/python2.6/site-packages/django/db/models/sql/query.py:340(get_aggregation)
5 0.000 0.000 0.020 0.004 /usr/lib64/python2.6/site-packages/MySQLdb/cursors.py:136(execute)
2 0.000 0.000 0.020 0.010 /usr/lib/python2.6/site-packages/django/db/models/query.py:1748(prefetch_one_level)
4 0.000 0.000 0.020 0.005 /usr/lib/python2.6/site-packages/django/db/backends/mysql/base.py:112(execute)
5 0.000 0.000 0.019 0.004 /usr/lib64/python2.6/site-packages/MySQLdb/cursors.py:316(_query)
60 0.000 0.000 0.018 0.000 /usr/lib/python2.6/site-packages/django/db/models/query.py:231(iterator)
5 0.012 0.002 0.015 0.003 /usr/lib64/python2.6/site-packages/MySQLdb/cursors.py:278(_do_query)
60 0.000 0.000 0.013 0.000 /usr/lib/python2.6/site-packages/django/db/models/sql/compiler.py:751(results_iter)
30 0.000 0.000 0.010 0.000 /usr/lib/python2.6/site-packages/django/db/models/manager.py:115(all)
50 0.000 0.000 0.009 0.000 /usr/lib/python2.6/site-packages/django/db/models/query.py:870(_clone)
51 0.001 0.000 0.009 0.000 /usr/lib/python2.6/site-packages/django/db/models/sql/query.py:235(clone)
4 0.000 0.000 0.009 0.002 /usr/lib/python2.6/site-packages/django/db/backends/__init__.py:302(cursor)
4 0.000 0.000 0.008 0.002 /usr/lib/python2.6/site-packages/django/db/backends/mysql/base.py:361(_cursor)
1 0.000 0.000 0.008 0.008 /usr/lib64/python2.6/site-packages/MySQLdb/__init__.py:78(Connect)
910/208 0.003 0.000 0.008 0.000 /usr/lib64/python2.6/copy.py:144(deepcopy)
22 0.000 0.000 0.007 0.000 /usr/lib/python2.6/site-packages/django/db/models/query.py:619(filter)
22 0.000 0.000 0.007 0.000 /usr/lib/python2.6/site-packages/django/db/models/query.py:633(_filter_or_exclude)
20 0.000 0.000 0.005 0.000 /usr/lib/python2.6/site-packages/django/db/models/fields/related.py:560(get_query_set)
1 0.000 0.000 0.005 0.005 /usr/lib64/python2.6/site-packages/MySQLdb/connections.py:8()
..etc
However, django-debug-toolbar shows the following:
Resource Usage
Resource Value
User CPU time 149.977 msec
System CPU time 119.982 msec
Total CPU time 269.959 msec
Elapsed time 326.291 msec
Context switches 11 voluntary, 40 involuntary
and 5 queries in 27.1 ms
The problem is that "top" shows the load average rising quickly and apache benchmark which i ran both on the local server and from a remote machine within the network shows that i am not serving many requests / second. What is the problem? this is as far as i could reach when profiling the code so it would be appreciated if someone can point of what i am doing here.
Edit (23/02/2013): Adding more details based on Andrew Alcock's answer: The points that require my attention / answer are (3)(3) I've executed "show global variables" on MySQL and found out that MySQL configurations had 151 for max_connections setting which is more than enough to serve the workers i am starting for uwsgi.
(3)(4)(2) The single request i am profiling is the heaviest one. It executes 4 queries according to django-debug-toolbar. What happens is that all queries run in: 3.71, 2.83, 0.88, 4.84 ms respectively.
(4) Here you're referring to memory paging? if so, how could i tell?
(5) On 16 workers, 100 concurrency rate, 1000 requests the load average goes up to ~ 12 I ran the tests on different number of workers (concurrency level is 100):
- 1 worker, load average ~ 1.85, 19 reqs / second, Time per request: 5229.520, 0 non-2xx
- 2 worker, load average ~ 1.5, 19 reqs / second, Time per request: 516.520, 0 non-2xx
- 4 worker, load average ~ 3, 16 reqs / second, Time per request: 5929.921, 0 non-2xx
- 8 worker, load average ~ 5, 18 reqs / second, Time per request: 5301.458, 0 non-2xx
- 16 worker, load average ~ 19, 15 reqs / second, Time per request: 6384.720, 0 non-2xx
AS you can see, the more workers we have, the more load we have on the system. I can see in uwsgi's daemon log that the response time in milliseconds increases when i increase the number of workers.
On 16 workers, running 500 concurrency level requests uwsgi starts loggin the errors:
writev(): Broken pipe [proto/uwsgi.c line 124]
Load goes up to ~ 10 as well. and the tests don't take much time because non-2xx responses are 923 out of 1000 which is why the response here is quite fast as it's almost empty. Which is also a reply to your point #4 in the summary.
Assuming that what i am facing here is an OS latency based on I/O and networking, what is the recommended action to scale this up? new hardware? bigger server?
Thanks