19
  • Ubuntu 10.04.2
  • nginx 0.7.65

I see some weird HTTP requests coming to my nginx server.

To better understand what is going on, I want to dump the whole HTTP request for such queries (i.e. dump all request headers and the body somewhere I can read them).

Can I do this with nginx? Alternatively, is there some HTTP server that allows me to do this out of the box, to which I can proxy these requests by means of nginx?

Update: Note that this box has a bunch of normal traffic, and I would like to avoid capturing all of it at a low level (say, with tcpdump) and filtering it out later.

I think it would be much easier to filter good traffic first in a rewrite rule (fortunately I can write one quite easily in this case), and then deal with bogus traffic only.

And I do not want to channel bogus traffic to another box just to be able to capture it there with tcpdump.

Update 2: To give a bit more detail, bogus requests have a parameter named (say) foo in their GET query (the value of the parameter can differ). Good requests are guaranteed never to have this parameter.

If I can filter by this in tcpdump or ngrep somehow — no problem, I'll use these.

Alexander Gladysh

3 Answers

32

Adjust the number of pre/post lines (-B and -A args) as needed:

tcpdump -n -S -s 0 -A 'tcp dst port 80' | grep -B3 -A10 "GET /url"

This lets you get the HTTP requests you want, on the box, without generating a huge PCAP file that you have to offload somewhere else.

Keep in mind that the capture is not guaranteed to be complete: if a large number of packets are flowing through the box, BPF can and will drop packets.
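
For the case described in the question, where the bogus requests carry a GET parameter named foo, the grep stage can match on the query string instead of a fixed URL. The following is only a sketch: match_foo_requests is a hypothetical helper name, the list of HTTP methods is an assumption, and it assumes the request line arrives in a single packet and that the grep in use supports -A.

```shell
#!/bin/sh
# Sketch: keep only requests whose query string has a parameter named "foo".
# match_foo_requests is a hypothetical helper, not part of tcpdump or grep.
# Reads "tcpdump -A" style text on stdin.
# $1 = number of lines to print after each matching request line (default 10).
match_foo_requests() {
    # [?&] before "foo=" ensures we match the parameter name itself,
    # not a longer name that merely ends in "foo" (e.g. "barfoo=").
    grep -E -A "${1:-10}" '(GET|POST|HEAD) [^ ]*[?&]foo='
}

# Usage on the server (needs root; -l makes tcpdump flush each line into the pipe):
#   tcpdump -l -n -s 0 -A 'tcp dst port 80' | match_foo_requests 10
```

As with the command above, the context count is a guess at how many lines the headers occupy, so raise it if the dumps look truncated.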

oo.
5

I don't know exactly what you mean by "dump the request", but you can use tcpdump and/or Wireshark to analyze the data:

# tcpdump port 80 -s 0 -w capture.cap

You can then open the file in Wireshark and see the conversation between the client and the server.

coredump
  • Thanks, but I have quite a bit of traffic on this server (99% of it is good), and I think that it would be hard to filter out that bunch of data for that bogus 1% I need. – Alexander Gladysh Mar 13 '11 at 01:28
  • ...if I capture all of it on such low level. :-) – Alexander Gladysh Mar 13 '11 at 01:35
  • I've updated the question to reflect that. – Alexander Gladysh Mar 13 '11 at 01:41
  • Alexander - well that means that 1 out of every 100 requests will have the weird headers you're looking for. Run a capture for a while and then search through the resulting log looking for the headers you want - that's surely not an unbearable amount of work. – EEAA Mar 13 '11 at 01:49
  • The problem is not the work, but the amount of data to process. (It may be bearable though, but, anyway, I'd like to see a more friendly solution.) – Alexander Gladysh Mar 13 '11 at 02:09
  • I don't know a way to capture requests without sniffing the cable. You can add filters in Wireshark (or even in tcpdump). – coredump Mar 13 '11 at 02:30
  • I hate to say this, but that's the solution, and quite honestly that's what it takes to fix it. So you'll have to go through that, no matter how much work, in order to fix your problem. – Jacob Mar 13 '11 at 02:38
  • @Jacob: well, here (http://serverfault.com/questions/51409/how-to-dump-entire-http-requests-with-apache) people say that ngrep can do the filtering. I have not managed to write a filter that works for it yet, though... – Alexander Gladysh Mar 13 '11 at 12:16
0

If you proxy the requests to Apache with mod_php installed you can use the following PHP script to dump the requests:

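<?php
// Append the headers and GET/POST parameters of every request to a log file.
$pid = getmypid();
$now = date('M d H:i:s');
$fp = fopen('/tmp/intrusion.log', 'a');

// getallheaders() is not available under every SAPI; rebuild it from $_SERVER if missing.
if (!function_exists('getallheaders'))
{
    function getallheaders()
    {
        // Must start as an array: assigning a string key to '' fails on newer PHP versions.
        $headers = array();
        foreach ($_SERVER as $name => $value)
        {
            if (substr($name, 0, 5) == 'HTTP_')
            {
                // Turn e.g. HTTP_USER_AGENT back into User-Agent.
                $headers[str_replace(' ', '-', ucwords(strtolower(str_replace('_', ' ', substr($name, 5)))))] = $value;
            }
        }
        return $headers;
    }
}

// Write one log line prefixed with timestamp, PID and the peer address.
function ulog($str) {
    global $pid, $now, $fp;
    fwrite($fp, "$now $pid {$_SERVER['REMOTE_ADDR']} $str\n");
}

// H = header, G = GET parameter, P = POST parameter.
foreach (getallheaders() as $h => $v) {
    ulog("H $h: $v");
}
foreach ($_GET as $h => $v) {
    ulog("G $h: $v");
}
foreach ($_POST as $h => $v) {
    ulog("P $h: $v");
}
fclose($fp);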

Note that since nginx is proxying the requests, $_SERVER['REMOTE_ADDR'] will be the address of the nginx proxy rather than of the client. Pass the real IP to Apache via proxy_set_header X-Real-IP $remote_addr; and log that instead (or just rely on it being logged via getallheaders()).

sorin
hobodave