How can ASLR be effective?

Question

I've heard the theory. Address Space Layout Randomization loads libraries at randomized locations in the virtual address space, so that if a hacker finds a hole in your program, he doesn't have a pre-known address to execute a return-to-libc attack against, for example. But after thinking about it for a few seconds, it doesn't make any sense as a defensive measure.

Let's say that our hypothetical TargetLib (libc or anything else the hacker is looking for) is loaded at a randomized address instead of a deterministic one. Now the hacker doesn't know ahead of time where TargetLib and the routines inside it are, but neither does the application code. It needs to have some sort of lookup table somewhere in the binary in order to find the routines inside of TargetLib, and that has to be at a deterministic location. (Or at a random location, pointed to by something else. You can add as many indirections as you want, but eventually you have to start at a known location.)

This means that instead of pointing his attack code at the known location of TargetLib, all the hacker needs to do is point his attack code at the application's lookup table's entry for TargetLib and dereference the pointer to the target routine, and the attack proceeds unimpeded.

Is there something about the way ASLR works that I don't understand? Because as described, I don't see how it's anything more than a speed bump, providing the image of security but no actual substance. Am I missing something?

score 2 · Answer 1 · answered Oct 03 '10 at 21:42

I believe that this is effective because it changes the base address of the shared library. Recall that imported functions from a shared library are patched into your executable image when it is loaded, and therefore there is no table per se, just specific addresses pointing at data and code scattered throughout the program's code.

It raises the bar for an effective attack because it makes a simple buffer overrun (where the return address on the stack can be set) into one where the overrun must contain the code to determine the correct location and then jmp to it. Presumably this just makes it harder.

Virtually all DLLs in Windows are compiled for a base address that they will likely not run at and will be moved anyway, but the core Windows ones tend to have their base address optimized so that the relocation is not needed.

Have you ever debugged a Windows EXE at the ASM level? There's a real import table in there. The loader doesn't patch the code (all the places where your code may call some external routine); it patches the import table, which is basically a long sequence of JMP instructions that the compiler generates CALLs to. — Mason Wheeler, Oct 03 '10 at 21:49
@Mason Wheeler - not for a long while, but that's good to know. While that makes it easier to determine a particular address, the net effect is the same, isn't it? It changes a known to an unknown, which simply makes the attack more difficult. — Jacob O'Reilly, Oct 03 '10 at 23:13
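
As a rough illustration of the import table described in these comments, here is a toy model in Python. It is only a sketch: `TARGETLIB_EXPORTS`, `load_imports`, `call_via_table`, the routine names, and all offsets and bases are invented for the example and do not correspond to any real loader API. The idea is that call sites never hard-code a library address; they read a table slot that the loader fills in once the library's base is known.

```python
# Hypothetical offsets of two routines inside TargetLib (invented values).
TARGETLIB_EXPORTS = {"do_work": 0x1000, "cleanup": 0x1040}


def load_imports(import_names, lib_base):
    """Model the loader: fill one table slot per imported routine."""
    return {name: lib_base + TARGETLIB_EXPORTS[name] for name in import_names}


def call_via_table(table, name):
    """Model a call site: it looks the target up instead of hard-coding it."""
    return table[name]


# Two program runs with different (made-up) library bases.
for lib_base in (0x6A000000, 0x71230000):
    table = load_imports(["do_work", "cleanup"], lib_base)
    print(f"TargetLib base={lib_base:#x} -> do_work at {call_via_table(table, 'do_work'):#x}")
```

The application code is identical in both runs; only the table contents differ.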

score 1 · Answer 2 · answered Jul 23 '14 at 08:41

I don't know if I get your question correctly, but I'll explain when ASLR is effective and when it is not.

Let's say that we have app.exe and TargetLib.dll. app.exe uses (is linked to) TargetLib.dll. To keep the explanation simple, let's assume that the virtual address space contains only these two modules.

If both are ASLR-enabled, app.exe's base address is unknown. It may resolve some function call addresses when it is loaded, but an attacker knows neither where the function is nor where the resolved variables are. The same holds for TargetLib.dll when it is loaded. Even though app.exe has a lookup table, an attacker does not know where the table is.
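
The point about the lookup table can be sketched as a toy model in Python. The base range, the alignment, and the table offset are made-up values for illustration and are not real Windows loader behaviour; `pick_base` and `table_address` are hypothetical helpers. The table sits at a fixed offset inside app.exe, but its absolute address is base plus offset, and the base changes from load to load.

```python
import random

ALIGNMENT = 0x10000            # assumed base alignment (illustrative only)
IMPORT_TABLE_OFFSET = 0x2000   # hypothetical offset of the lookup table in app.exe


def pick_base(rng):
    """Pick a hypothetical randomized, aligned load address for a module."""
    return rng.randrange(0x10000000, 0x70000000, ALIGNMENT)


def table_address(base):
    """Absolute address of the lookup table once the module is loaded at base."""
    return base + IMPORT_TABLE_OFFSET


rng = random.Random(1234)      # seeded only so that the sketch is repeatable
bases = [pick_base(rng) for _ in range(5)]
for base in bases:
    print(f"app.exe base={base:#010x}  lookup table at {table_address(base):#010x}")
```

So the "known starting location" the question relies on exists only relative to app.exe's base, which is itself randomized when app.exe is ASLR-enabled.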

Since an attacker cannot tell what a specific address contains, he must attack the application without relying on any fixed address information. That is usually harder with the usual attack methods such as stack overflow, heap overflow, use-after-free...

On the other hand, if app.exe is NOT ASLR-enabled, it is much easier for an attacker to exploit the application, because there may be a call to an interesting API at a specific address in app.exe, and the attacker can use that address as a jump target. (Attacking an application usually starts with jumping to an arbitrary address.)

Supplement: You may already understand this, but I want to make one thing clear. When an attacker exploits an application through a vulnerability such as memory corruption, he is usually forced to go through a fixed address jump instruction; he cannot use a relative address jump instruction. This is why ASLR is really effective against such exploits.

All right. It's the last point that I don't quite get. Why can an attacker use `fixed address jump` (which is one ASM opcode) but not `relative address jump` (which is just another ASM opcode)? Is there some magical difference there that I'm not aware of? — Mason Wheeler, Jul 23 '14 at 13:41
Well, to explain it you first need to understand how such an exploit works. One good example is the typical buffer overflow with a return address overwrite. I won't explain the details here, but the main point is that the attacker overwrites the `return address` with some fixed value. When execution exits the vulnerable function, it jumps to the overwritten address. This jump is always a `fixed address jump`, not a `relative jump`. There may be some vulnerabilities that can be exploited with a `relative jump`, but they are rare cases. — Shu Suzuki, Jul 23 '14 at 18:19
In other words, an attacker can only corrupt `data`, not `instruction`s. To exploit an app they corrupt the `data` used by a `fixed address jump` instruction. What the attacker does is place shellcode (the code they want to execute) in memory and jump to it. Inside the shellcode they can use relative jumps, but under ASLR jumping to the shellcode itself is a hard problem. — Shu Suzuki, Jul 23 '14 at 19:51
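
The distinction drawn in these comments can be modelled in a few lines of Python. This is a toy model under stated assumptions: the offsets and bases are invented, and `return_target` and `relative_target` are hypothetical helper names. A return address is a stored absolute value, so it lands where intended only if the module base was guessed correctly. A relative jump is computed from the address of the code that is already running, so it works at any base.

```python
FUNC_OFFSET = 0x1500     # hypothetical offset of the wanted routine in its module
CALLER_OFFSET = 0x1200   # hypothetical offset of an instruction in the same module


def return_target(stored_value):
    """A return uses the stored value as-is: it is an absolute address."""
    return stored_value


def relative_target(current_address, displacement):
    """A relative jump adds a displacement to the current address.

    Simplified: real CPUs usually measure from the end of the instruction.
    """
    return current_address + displacement


assumed_base = 0x40000000   # base assumed when the overwriting data was crafted
for actual_base in (0x40000000, 0x5A3F0000):
    wanted = actual_base + FUNC_OFFSET
    # Overwritten return address: computed in advance from the assumed base.
    absolute_hit = return_target(assumed_base + FUNC_OFFSET) == wanted
    # Relative jump inside already-running code: independent of the base.
    relative_hit = relative_target(actual_base + CALLER_OFFSET,
                                   FUNC_OFFSET - CALLER_OFFSET) == wanted
    print(f"base={actual_base:#x} absolute_hit={absolute_hit} relative_hit={relative_hit}")
```

In this model the absolute value matches only when the actual base equals the assumed one, which is the situation ASLR is meant to make unlikely.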
