
I have a Django app running under Apache2 and mod_wsgi and, unfortunately, lots of requests trying to use the server as a proxy. The server correctly responds with 404 errors, but those errors are generated by the Django (WSGI) app, which causes high CPU usage.

If I turn off the app and let Apache handle the response directly (send a 404), the CPU usage drops to almost 0 (mod_proxy is not enabled).

Is there a way to configure Apache to respond directly to these kinds of requests with an error before the request hits the WSGI app?

I have seen that maybe mod_security would be an option, but I'd like to know if I can do it without it.

EDIT. I'll explain it a bit more.

In the logs I have lots of connections trying to use the server as a web proxy (e.g. connections like GET http://zzz.zzz/ HTTP/1.1 where zzz.zzz is an external domain, not mine). These requests are passed on to mod_wsgi, which then returns a 404 (as per my Django app). If I disable the app, Apache returns the error directly, because mod_proxy is disabled. What I'd ultimately like to do is prevent Apache from passing requests for invalid domains to the WSGI app; that is, if the request is a proxy request, return the error directly and do not execute the WSGI app.

EDIT2. Here is the apache2 config, using VirtualHost files in sites-enabled (I have removed email addresses, changed IPs to xxx, and changed the server alias to sample.sample.xxx). What I'd like is for Apache to reject, with an error, any request that doesn't go to sample.sample.xxx; that is, accept only relative requests to the server, or fully qualified requests only for the actual ServerAlias.

default:

<VirtualHost *:80>
        ServerAdmin alejandro.mezcua@xxxx.com
        ServerName X.X.X.X
        ServerAlias X.X.X.X

        DocumentRoot /var/www/default
        <Directory />
                Options FollowSymLinks
                AllowOverride None
        </Directory>
        <Directory /var/www/>
                Options FollowSymLinks
                AllowOverride None
                Order allow,deny
                allow from all
        </Directory>

        ErrorDocument 404 "404"
        ErrorDocument 403 "403"
        ErrorDocument 500 "500"
        ErrorLog ${APACHE_LOG_DIR}/error.log

        LogLevel warn

        CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>

actual host:

<VirtualHost *:80>
 ErrorDocument 404 "404"
 ErrorDocument 403 "403"
 ErrorDocument 500 "500"

 WSGIScriptAlias / /var/www/sample.sample.xxx/django.wsgi

 ServerAdmin alejandro.mezcua@xxxx.xxx
 ServerAlias sample.sample.xxx
 ServerName sample.sample.xxx

 CustomLog /var/www/sample.sample.xxx/log/sample.sample.xxx-access.log combined

 Alias /robots.txt /var/www/sample.sample.xxx/static/robots.txt
 Alias /favicon.ico /var/www/sample.sample.xxx/static/favicon.ico

 AliasMatch ^/([^/]*\.css) /var/www/sample.sample.xxx/static/$1

 Alias /static/ /var/www/sample.sample.xxx/static/
 Alias /media/ /var/www/sample.sample.xxx/media/

 <Directory /var/www/sample.sample.xxx/static/>
  Order deny,allow
  Allow from all
 </Directory>

 <Directory /var/www/sample.sample.xxx/media/>
  Order deny,allow
  Allow from all
 </Directory>
</VirtualHost>

EDIT 3. Fixed. The problem was the loading order of the Virtual Host files. The attack requests didn't really have the Host header set to my domain, but the Apache status page was showing it that way because Apache was loading the default VirtualHost file after the WSGI app VirtualHost file. The solution was to rename the default virtual host file to 00-default so that Apache loads it first. With that in place, all the tips you guys have mentioned helped to ignore those requests. CPU is back under control!
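
For context, here is a sketch of why the file name matters. It assumes a Debian/Ubuntu-style Apache 2.2 layout in which apache2.conf pulls in the vhost files with a directory include; the exact path and include line are assumptions and may differ on other systems. Apache reads the files in such a directory in alphabetical order, and the first VirtualHost it reads for a given address and port acts as the default for requests that match no ServerName or ServerAlias:

```apache
# /etc/apache2/apache2.conf (assumed Debian/Ubuntu layout)
# Files under sites-enabled/ are read in alphabetical order, so a file
# named 00-default is parsed before any file whose name sorts later,
# and its <VirtualHost *:80> becomes the default for unmatched requests.
Include sites-enabled/
```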

3 Answers


The simplest course of action I can recommend is to keep mod_proxy disabled and use two different <VirtualHost *:80>...</VirtualHost> sections.

In the first one you can put any ServerName you like; because it comes first, Apache will use it for HTTP requests whose Host: header does not match any other VirtualHost section (as is the case for genuine proxy requests) and for requests without a Host: header. It might look like this:

<VirtualHost *:80>
    ServerAdmin webmaster@localhost
    ServerName default
    RedirectMatch gone .*
</VirtualHost>

The second one will be your actual server configuration, more or less exactly like what you have posted, with the correct ServerName and ServerAlias directives.

EDIT: if, as you commented, the Host: header contains your domain, then you may simply want to add

RedirectMatch gone ^http:.*

to your existing vhost. That should do the trick, although I have not tested it.
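
If the RedirectMatch pattern does not fire (it is matched against the URL path, which may already have been stripped of scheme and host by the time mod_alias sees it), a mod_rewrite rule that inspects the raw request line is another option. This is a minimal, untested sketch. It assumes mod_rewrite is enabled, that the rewrite rule is evaluated before the request is mapped to the WSGI script, and that sample.sample.xxx is the only host name you want to accept in absolute-URI requests:

```apache
<VirtualHost *:80>
    ServerName sample.sample.xxx

    RewriteEngine On
    # THE_REQUEST holds the raw request line, e.g. "GET http://zzz.zzz/ HTTP/1.1".
    # First condition: the request target is an absolute http(s) URI.
    RewriteCond %{THE_REQUEST} ^[A-Z]+\s+https?:// [NC]
    # Second condition: that URI does not point at this site.
    RewriteCond %{THE_REQUEST} !^[A-Z]+\s+https?://sample\.sample\.xxx[:/\s] [NC]
    # Answer 410 Gone; the intent is that the WSGI application is never invoked.
    RewriteRule ^ - [G]

    # ... rest of the existing vhost (WSGIScriptAlias, Alias, Directory) ...
</VirtualHost>
```

Replacing the `[G]` flag with `[F]` would answer 403 Forbidden instead of 410 Gone.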

pino42
  • For precision's sake: this will refuse "proxy" requests with a `410 Gone` instead of a `404 Not found` status, but I believe it will achieve exactly what you want; in any case it will relieve your app of the unwanted load. You can do the same by, say, specifying an empty directory as DocumentRoot and disabling directory indexes inside the default `VirtualHost`; then you will get a `404 Not found`. – pino42 Nov 05 '12 at 16:44
  • Why would apache use the first vhost for "proxy requests"? This is only true if the Host: request header does not exist or is not provided. – adaptr Nov 05 '12 at 16:49
  • Yes, this does not work when the requests are directed to the vhost, so it is only partially useful. The attacks have the Host header set to the given domain name. – Alejandro Mezcua Nov 05 '12 at 16:54
  • Because the first vhost is the default one. Looking at the [documentation](http://httpd.apache.org/docs/2.2/vhosts/examples.html): "Due to the fact that www.example.com is first in the configuration file, it has the highest priority and can be seen as the default or primary server. That means that if a request is received that does not match one of the specified ServerName directives, it will be served by this first VirtualHost." So it works both when the `Host:` header is missing, and when it contains something not configured elsewhere. – pino42 Nov 05 '12 at 16:56
  • @AlejandroMezcua: I edited the answer to solve the problem in this case. – pino42 Nov 05 '12 at 16:58
  • You seem to misunderstand the differences between a Host: header, the HTTP protocol version used, a real HTTP proxy, and "proxy-like" behaviour (which is unrelated to any of the others). – adaptr Nov 05 '12 at 17:00
  • @adaptr: thanks; upon re-reading I understood why my wording was wrong. I edited my answer. – pino42 Nov 05 '12 at 17:04
  • But if a request comes with the host header set (e.g. sample.sample.com) and I have a Virtual Host that answers to that, then the default configuration will not match, and apache will use the other virtualhost, which loads the WSGI app, and this will not be of use... – Alejandro Mezcua Nov 05 '12 at 17:23
  • I believe that if you add the `RedirectMatch gone ^http:.*` line to your "real" VirtualHost, as per my latest edit, Apache will issue a `410 Gone` without asking the WSGI application. I haven't tested this, though. – pino42 Nov 05 '12 at 17:50
  • I have edited the question. Your solution works as long as this virtual host file is the first one loaded by Apache. In my case, because of the file names, this was not the case, and the WSGI virtual host was being loaded first. The Host header was not the problem; I was misled by what I saw in Apache's status page. – Alejandro Mezcua Nov 05 '12 at 17:54
  • Correct! The fact that order matters becomes tricky when splitting the configuration across different files. Also, glad to know that the attacks are not so nasty as to include your working `ServerName`! – pino42 Nov 05 '12 at 17:58
  • I briefly mention the default VirtualHost in the section 'Fallback to default VirtualHost definition.' of http://blog.dscpl.com.au/2012/10/requests-running-in-wrong-django.html. It may provide some more context for understanding the issue, as it arises in other ways as well. – Graham Dumpleton Nov 05 '12 at 22:05

If by proxying you mean using the CONNECT HTTP method, you can try:

<Limit CONNECT>
  Order allow,deny
  Deny from all
</Limit>

Otherwise you will need to explain what you mean by proxying.
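
A sketch of where that block could live. Order and Deny are per-directory directives, so the `<Limit>` section is wrapped in a `<Location />` here; this placement, and the use of the question's host name, are assumptions rather than part of the original snippet:

```apache
<VirtualHost *:80>
    ServerName sample.sample.xxx

    <Location />
        # Refuse the CONNECT method for every URL in this vhost
        # (Apache 2.2 access-control syntax)
        <Limit CONNECT>
            Order allow,deny
            Deny from all
        </Limit>
    </Location>
</VirtualHost>
```

Note that this only covers CONNECT; requests such as GET http://zzz.zzz/ HTTP/1.1 use the GET method and are not affected by it.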

adaptr
Graham Dumpleton

You should probably add a dummy default vhost entry whose ServerName never matches anything, make sure it is the first vhost Apache loads, and deny all access to it.

NOTE that whether this will help against these proxy-like attacks depends on the Host: header they supply: requests with no Host: header, or with one that matches no configured vhost, will hit the first vhost anyway - which would be good. Requests that carry your real host name will still reach the WSGI vhost.

<VirtualHost *:80>
  ServerName dummy.that.will.never.match
  DocumentRoot /tmp
  Redirect 404 /
  LogLevel crit
  ErrorLog ${APACHE_LOG_DIR}/dummy_error.log
  CustomLog ${APACHE_LOG_DIR}/dummy_access.log "[%t] %h(%a) Host: %{Host}i %H %m %r"
</VirtualHost>

The above CustomLog will provide you with a neat list of offenders.
You don't need the status since you know what it is :)

Also remove the useless ServerAlias from your WSGI host, and move the root Directory definition outside of ANY vhosts - this should be in the main server config, and it MUST deny access to everything and everyone.

<Directory />
  AllowOverride None
  Options None
  Order Allow,Deny
  Deny From All
</Directory>

You can make per-vhost exceptions for symlinks later.
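
A sketch of such per-vhost exceptions, using the Apache 2.2 access syntax and the paths from the question; the exact set of directories is an assumption. With the root locked down, every directory the site serves from needs an explicit allow, including the one holding the WSGI script:

```apache
<VirtualHost *:80>
    ServerName sample.sample.xxx

    # Static files: re-enable symlinks and access for this tree only
    <Directory /var/www/sample.sample.xxx/static/>
        Options FollowSymLinks
        Order allow,deny
        Allow from all
    </Directory>

    # The WSGI script must also be reachable once "/" denies everyone
    <Directory /var/www/sample.sample.xxx/>
        <Files django.wsgi>
            Order allow,deny
            Allow from all
        </Files>
    </Directory>
</VirtualHost>
```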

adaptr