5

Here are a few samples of strange code I see in our access logs. Can anyone decode this?

For example:

\xb3\xe1\xdd=H\t\xd5\xd2\xf0ml\xf1\x10\xee/\xa0$\xeaY\xa5\xe7\x81d \xd5\x1f\xd9 QI\xd9\'\xfb4I\xb8\xf3\x1d0:\xb5i\x18Q\x02\xa5\x10$\xdd\xcf\xfa\xc2\xfa\x15\xd0\xa8\xa5\xfc\xb2\xda\xb9\x9bA_\x89\xc4~\x0e\x0ebg*>\x18\x12\x9aniA\xf6\xfc\x85%]\x1d\xa6\x16\xfe\x96\x13\xe1\xd8\xb2\xf3i~\xde\xec6\xdbgW\xc3c\xac2\x7f\x9f&\xa5\xce\x14B8~8\xbe\xff1\xa8\xe6\x9a\x9d\xf7 \x14\x10\x9d\xce\xda\x06\x93r\xe7\x86\x98\xa1\x85^\xfa\x93\xf1\x94G\x95\xc0\x1b\xc9\x81\xcb<\x04/\x836E\x85\xbd\xae%\x07D\xe9j\x80\x7f=\xccWW\x04.\xbe\x0f\xb6\x8c

Now, if we leave out all the unreadable characters we get:

=H\tml/$Yd  QI'4I0:iQ$A_~bg*>niA%]i~6gWc2&B8~81 r^G</6E%Dj=WW.

The "H\tml" part near the beginning could suggest that the data above contains some HTML, or is that just a coincidence?

Here are a few more samples:

\xbdl\x1cq\x1e\xf65\xe3@3\xd8E\xa8\xf7\xc0e\x10\xfe\x15\xbfzhap\xff\xe6i\x9cq\xe3bGm\x81DWQ\xf5\x94\xbav~\\\xaa\xd0\xed\xdfl\x028\x1d\xcds\x07H\x02\x04\xf2\x8fU\xe0\xd6x,\x9f\x98)\xe8\x1c \xc7\xdd\xd7\xea\xd0\x12h^\xb4\xd0\x85G\xdb\xe4 \xe6\xabYM\xf36\"<\xb6\x1e\xeak]\x93\xc2D\xfa\xc4\xe9\xa93,b\xf5\x80\x15\x92L5\x02\xc3GY\xa7k\x7f\xa2\xfd}\xa2%+\x14\xf5\xe8\x95\x1f\xe2\xef\xd41

st|]%Y\xbf\xeaj\xe9<z\xbb\xfb\xe76\xbbf>\xe9\x1dU{\xaf\x97\x1b\x9e\xf3&\x9b\x87t{\xf3O0\x8c`TQ\xdc\xbd.\xee\xff\x9cEG\xabU\xc5 \xfc[\xe0\x0f\xa5jK\x85\x92\xb2\x90\x96E\xba\x9c\x9c\xa5\xccA`\v\xa0\xd7>3\t\x89u\x11\x817\xa5\xb2\x83\xfa\x89A\x14\x07\xe1\xc4>\"\xb4\x02m\xe4\x9eZ\x9b>\xb0\xe5\x9c\x15\xa0p\xado:\xb4\x1d\x1a\xb7\xb1\x1c\x0f\xa3\xadz-\xdc\xb5q\xb9\xfc\xb95g\xb8\xa8 \xd2t\xa3\x90\xe7N\xa7e \x15I\xe6\x1b\xdbNB5\xfa3\xed\xfdG\t\x19(\xe1\x9f

wo\x01\xb9\x98\xa6q.\x0c&\xba\x1dnXN\xce\xb7\xd3\x99\xfd\x12>*\xa5\x89\xc9\xb2 lQ\x89\xcc\x9f\x113+\xb5\xc4\x86\xb6g\x97\x15]\x98g\xc1\xa1\xa8\xfeK\x03\xb5w\xe4\xf8&\xc8`1\x8c\x1c\x88\x82\xc2]\x8d&\xbc\x8cU&4\xc5[jS \xb0\xed\xf7m{\x95i

We see such codes often in the logs, like millions of times a day. Makes me very curious about their contents :))

(more) code also available via http://pastebin.com/ZcXM5NHs

vbence
schuilr
  • You may want to share more info on what system(s) you're running... – bitxwise Mar 05 '11 at 09:32
  • We're running a website on a LAMP platform. But these requests are sent by an external party to our web servers. The codes are sent as the request URI, and the rest of the request is encoded normally (they are GET requests and include an HTTP Host header) – schuilr Mar 05 '11 at 09:40
  • 3
    Stack Overflow does not "malform code". People malform code when they post it without first reading the formatting instructions. Please edit your post and format it correctly, using the code button. – Mark Byers Mar 05 '11 at 09:41
  • Oh pardon me for being a total n00b :) Anyway, it's available through Pastebin now – schuilr Mar 05 '11 at 09:43
  • 2
    Just a reminder: if you like answers you can vote them up, or even accept them :) – vbence Mar 05 '11 at 19:11

5 Answers

5

This is definitely trying to exploit a supposed buffer overflow vulnerability in your server. I guess it is x86 code. You can decode the strings in PHP, for example:

<?php echo("\xbdl\x1cq\x1e\xf65\xe3@3...");

If you write the output to a file, you can open it in a disassembler and see the assembler instructions, although I don't think you will get any valuable information by looking at them.

These are sweep attacks; there is little chance that someone is trying to attack your server specifically.

vbence
  • I was planning to try to disassemble it. Though my ASM skills are somewhat limited :S I don't believe this party is explicitly targeting us, and I find it hard to believe it is exploit code as it is a major party sending these requests (unfortunately I can't disclose their name here) – schuilr Mar 05 '11 at 10:59
  • Also, it matters at which byte you start disassembling. It is not certain that the first byte of this sequence is the first byte of the code; try starting at the 2nd, 3rd, etc. bytes too and see when it gives reasonable instructions. – vbence Mar 05 '11 at 11:02
  • 1
    Do the queries look like `GET \xbdl\x1cq\x1e\xf65 HTTP/1.1`, or are they more like `GET /something.dll?\xbdl\x1cq\x1e\xf65`? If the second is the case, then the rest of the query could suggest what kind of API they are trying to communicate with. (In this case it could be a legitimate request for your server too.) – vbence Mar 05 '11 at 13:15
2

This is for decoding back into binary. (Note: the list of backslash escapes could be incomplete. I just typed in the usual suspects)

#include <stdio.h>
#include <string.h>

/* Read escaped log lines from stdin and write the decoded bytes to stdout. */
int main(void)
{
        char buff[2000];
        size_t len, pos;
        int ch;
        unsigned val;

        while (fgets(buff, sizeof buff, stdin)) {
                len = strlen(buff);
                /* strip the trailing newline */
                while (len && buff[len-1] == '\n') buff[--len] = 0;
                for (pos = 0; pos < len; pos++) {
                        ch = buff[pos];
                        /* plain character, or a lone backslash at end of line */
                        if (ch != '\\' || pos + 1 >= len) { putc(ch, stdout); continue; }
                        switch (ch = buff[++pos]) {
                        case '\\':
                        case '\'':
                        case '"':  putc(ch, stdout); break;
                        case 't':  putc('\t', stdout); break;
                        case 'n':  putc('\n', stdout); break;
                        case 'r':  putc('\r', stdout); break;
                        case 'a':  putc('\a', stdout); break;
                        case 'v':  putc('\v', stdout); break;
                        case 'b':  putc('\b', stdout); break;
                        case ' ':  putc(' ', stdout); break;
                        case 'x':
                                /* truncated escape: emit it literally */
                                if (pos + 2 >= len) { fputs("\\x", stdout); break; }
                                /* two digits follow; they are assumed to be
                                   hex digits and are not validated */
                                val = 0;
                                ch = buff[++pos];
                                if (ch >= 'a') val = 10 + (ch - 'a');
                                else if (ch >= 'A') val = 10 + (ch - 'A');
                                else if (ch >= '0') val = (ch - '0');
                                val <<= 4;
                                ch = buff[++pos];
                                if (ch >= 'a') val += 10 + (ch - 'a');
                                else if (ch >= 'A') val += 10 + (ch - 'A');
                                else if (ch >= '0') val += (ch - '0');
                                putc(val, stdout);
                                break;
                        default:
                                /* unknown escape: emit the character as-is */
                                putc(ch, stdout);
                                break;
                        }
                }
        }

        return 0;
}

The bad news is: the supplied strings don't seem to yield valid x86 code. It may have been encrypted, with a decrypt/bootstrap stub at the end, near the overflow part. Disclaimer: I am not an assembly expert.
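
The decoder above takes whatever two characters follow \x and does not check that they are hex digits. A stricter conversion could look like the sketch below. The helper names hex_digit and parse_hex_escape are made up for illustration, and accepting a single digit (so that \xA decodes as 0x0A) is an assumption about what is wanted; Apache-style log escapes normally carry two digits.

```c
#include <stddef.h>

/* Return the value 0..15 of a hex digit, or -1 if c is not one. */
static int hex_digit(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return 10 + (c - 'a');
    if (c >= 'A' && c <= 'F') return 10 + (c - 'A');
    return -1;
}

/*
 * Parse the one or two hex digits that follow "\x", starting at s.
 * Stores the decoded byte in *out and returns the number of digits
 * consumed (1 or 2), or 0 if s does not start with a hex digit.
 * s must be NUL-terminated.
 */
static size_t parse_hex_escape(const char *s, unsigned char *out)
{
    int hi = hex_digit((unsigned char)s[0]);
    int lo;

    if (hi < 0) return 0;                 /* e.g. "\x;" is not an escape */
    lo = hex_digit((unsigned char)s[1]);
    if (lo < 0) {                         /* single digit: no shift */
        *out = (unsigned char)hi;
        return 1;
    }
    *out = (unsigned char)((hi << 4) | lo);
    return 2;
}
```

A caller would advance its position by the returned count and, on 0, fall back to emitting the backslash and the x literally.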

wildplasser
  • does this code actually enforce `A-F | a-f | 0-9`? perhaps I'm mis-interpreting the code, but it seems that if something like `x;` appears then it'll end up decoding that as `';' - '0' ==> 59 - 48 = 11` even though `;` rarely shows up in any base representations. Another tidbit is that if, for any reason, only 1 letter was used, i.e. `0xA` instead of `0x0A` for `\n`, this code might incorrectly up-shift it by 16 (?) – RARE Kpop Manifesto Jul 06 '23 at 20:56
2

Let's have a look at the first part:

\xb3\xe1\xdd=H\t\xd5\xd2\xf0ml\xf1\x10

Escape codes of the form \xb3 are hexadecimal codes for 8-bit values (bytes). In this case \xb3 is the byte 0xB3, which is 179 in decimal.

The escape code \t is the tab character.

The "H\t" is just an H (= 72) followed by a tab character (= 9). It is not Ht and is not related to HTML.
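
As a quick check of these values, here is a small sketch in C. It relies on the C compiler decoding \x and \t escapes in a string literal the same way as described above; the names dump_bytes and sample_prefix are chosen for this example only.

```c
#include <stdio.h>

/* Print each byte of buf in hex and in decimal, one byte per line. */
static void dump_bytes(const unsigned char *buf, size_t len, FILE *out)
{
    size_t i;

    for (i = 0; i < len; i++)
        fprintf(out, "0x%02x = %3u\n", (unsigned)buf[i], (unsigned)buf[i]);
}

/* The first six bytes of the first sample: \xb3 \xe1 \xdd = H \t.
 * The array also holds the terminating NUL, so its size is 7. */
static const unsigned char sample_prefix[] = "\xb3\xe1\xdd=H\t";
```

Calling dump_bytes(sample_prefix, sizeof sample_prefix - 1, stdout) lists the six bytes; the first is 0xb3 (179), the fifth is the H (72) and the sixth is the tab (9).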

I suspect that it is someone sending data to your webserver in an attempt to exploit a vulnerability. You should make sure that your webserver is fully updated to prevent the exploit from working.

Mark Byers
0

My first guess is that \x starts an escape sequence using two hex characters. So try replacing \xAB with the character corresponding to the hex AB.

\t is probably a tab, and \' an escaped '

CodesInChaos
0

Trying to reverse engineer binary is a very painful process that is near impossible unless you know what the contents should be in the first place. This is because such files often contain headers that tell the program reading them how to decode them: for example, the exact bit where the data starts, which bits represent which data, whether the data is float, double, or int, and what byte order (endianness) the data is stored in.

You should probably spend your time working out what program wrote the log, and use it to convert it back to ASCII, or hunt in some docs for the format of the binary logs.
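
To illustrate the byte-order point only: the same four bytes give different integers depending on the endianness you assume. This is a minimal sketch; the function names are invented here, and nothing in the samples indicates that they actually contain 32-bit integers.

```c
#include <stdint.h>

/* Interpret four bytes as an unsigned 32-bit integer, least significant byte first. */
static uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Interpret four bytes as an unsigned 32-bit integer, most significant byte first. */
static uint32_t read_u32_be(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         | (uint32_t)p[3];
}
```

For the bytes b3 e1 dd 3d, the little-endian reading is 0x3ddde1b3 and the big-endian reading is 0xb3e1dd3d, so without knowing the format there is no way to tell which one was meant.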

Tom
  • The application that wrote the log is Apache. It's not our server that encoded data in this way; we receive these codes as request URI to our server. – schuilr Mar 05 '11 at 09:45
  • "Trying to reverse engineer binary is a very painful process that is near impossible unless" - unless you're a skilled reverse engineer. –  Nov 07 '12 at 21:12