(tl;dr version with a possible solution at the end.)
It's almost certainly due to address space randomization applied to shared libraries. Running the following command a few times will let you see how it works:
$ cat /proc/self/maps
/proc/self/ refers to the current process (the one opening the file). There are also /proc/<pid>/ directories for specific PIDs. The maps file lists the mappings for the process -- in this case for the cat process itself.
Here's the output for one run on my system:
00400000-0040c000 r-xp 00000000 08:01 3409248 /bin/cat
0060b000-0060c000 r--p 0000b000 08:01 3409248 /bin/cat
0060c000-0060d000 rw-p 0000c000 08:01 3409248 /bin/cat
0063a000-0065b000 rw-p 00000000 00:00 0 [heap]
7f017ef95000-7f017f761000 r--p 00000000 08:01 8126750 /usr/lib/locale/locale-archive
7f017f761000-7f017f91b000 r-xp 00000000 08:01 11155466 /lib/x86_64-linux-gnu/libc-2.19.so
7f017f91b000-7f017fb1a000 ---p 001ba000 08:01 11155466 /lib/x86_64-linux-gnu/libc-2.19.so
7f017fb1a000-7f017fb1e000 r--p 001b9000 08:01 11155466 /lib/x86_64-linux-gnu/libc-2.19.so
7f017fb1e000-7f017fb20000 rw-p 001bd000 08:01 11155466 /lib/x86_64-linux-gnu/libc-2.19.so
7f017fb20000-7f017fb25000 rw-p 00000000 00:00 0
7f017fb25000-7f017fb48000 r-xp 00000000 08:01 11155454 /lib/x86_64-linux-gnu/ld-2.19.so
7f017fd1c000-7f017fd1f000 rw-p 00000000 00:00 0
7f017fd23000-7f017fd47000 rw-p 00000000 00:00 0
7f017fd47000-7f017fd48000 r--p 00022000 08:01 11155454 /lib/x86_64-linux-gnu/ld-2.19.so
7f017fd48000-7f017fd49000 rw-p 00023000 08:01 11155454 /lib/x86_64-linux-gnu/ld-2.19.so
7f017fd49000-7f017fd4a000 rw-p 00000000 00:00 0
7fffacef5000-7fffacf16000 rw-p 00000000 00:00 0 [stack]
7fffacf5a000-7fffacf5c000 r-xp 00000000 00:00 0 [vdso]
7fffacf5c000-7fffacf5e000 r--p 00000000 00:00 0 [vvar]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
The top three lines are the code segment, read-only data segment, and read-write data segment from your executable. The remaining lines are the stack, heap, various segments for shared libraries, memory-mapped files (as a side note, libraries are just memory-mapped files too), and some internal stuff related to how certain system calls are implemented.
If you repeat the command a few times, you will likely see all mappings except the code and data segments from the executable move around randomly. This is a security measure. Not knowing where things are in memory makes certain exploits harder to pull off, as you can't, for example, just jump directly to an address you know will contain a useful routine.
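For example, grepping out just the libc text segment makes the randomization easy to spot: it lands at a different base address on each run (the addresses below are made up for illustration, and the library path is the one from my system):
$ grep 'libc.*r-xp' /proc/self/maps
7f2b4e8a3000-7f2b4ea5d000 r-xp 00000000 08:01 11155466 /lib/x86_64-linux-gnu/libc-2.19.so
$ grep 'libc.*r-xp' /proc/self/maps
7f8d31c55000-7f8d31e0f000 r-xp 00000000 08:01 11155466 /lib/x86_64-linux-gnu/libc-2.19.so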
The main reason why address space randomization is not applied to the code and data segments from the executable itself is probably efficiency. Code that isn't loaded to a fixed address has to be position-independent, which adds some overhead. This is why shared libraries need to be explicitly compiled with -fPIC.
(Shared libraries also need to be position-independent for other reasons besides security. Using a fixed address for each library would cause problems if two libraries happened to get overlapping load addresses.)
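As a concrete build-step example (foo.c, libfoo.so and app are just placeholder names here): the shared library gets -fPIC, while the main executable is typically built without it:
$ gcc -fPIC -shared -o libfoo.so foo.c
$ gcc -o app main.c -L. -lfoo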
Unfortunately I'm not familiar with PinTool. I believe GDB simply disables address space randomization (using the personality(2) system call) to get predictable addresses for shared libraries.
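If you wanted the same effect in a small launcher of your own, the personality(2) call would look roughly like this (a minimal sketch, not what GDB literally does):
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <sys/personality.h>

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s program [args...]\n", argv[0]);
        return 1;
    }

    /* Add ADDR_NO_RANDOMIZE to the current persona so the exec'd
       program runs with a non-randomized address space. */
    if (personality(personality(0xffffffff) | ADDR_NO_RANDOMIZE) == -1) {
        perror("personality");
        return 1;
    }

    execvp(argv[1], &argv[1]);
    perror("execvp");
    return 1;
}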
Address space randomization can be turned off for a single shell session (this seems to use personality() under the hood too), or globally by doing echo 0 > /proc/sys/kernel/randomize_va_space (see the /proc/sys/ documentation).
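One way to get the single-session variant is setarch(8) with its -R (--addr-no-randomize) flag, e.g.:
$ setarch $(uname -m) -R /bin/bash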
I found the following on this page. Might be relevant.
Does Pin change the application code and data addresses?
...
Note: Recent linux kernels intentionally move the location of stack and dynamically allocated data from run to run, even if you are not using pin. On RedHat-based systems you can workaround this by running Pin as follows:
$ setarch i386 pin -t pintool -- app
tl;dr answer
If all you need is to correlate library addresses reported by PinTool with objdump disassembly addresses, and you don't mind doing some manual work each time, then the following should work:
Print /proc/self/maps from your process. (You could also run it in the background and print /proc/<pid>/maps from the shell, using e.g. $! to get the PID.)
Check which mapping the address belongs to. In the library case, it'll probably be the text segment of some library (marked r-xp in the maps output).
Subtract the start address of the mapping from the address you see in PinTool.
This will get you the addresses you see in an objdump disassembly (when you run it on that same library). You could use addr2line(1) to get the source line too if the library has debugging information.
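As a made-up worked example (reusing the maps output above): suppose PinTool reports the address 0x7f017f7eb5d0, and the maps output shows the libc text segment at 7f017f761000-7f017f91b000. Then
0x7f017f7eb5d0 - 0x7f017f761000 = 0x8a5d0
and 0x8a5d0 is the address to look for in objdump -d /lib/x86_64-linux-gnu/libc-2.19.so, or to pass to addr2line:
$ addr2line -f -e /lib/x86_64-linux-gnu/libc-2.19.so 0x8a5d0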
There might be nicer workflows of course. This worked for me when playing around a bit with dlopen(3) and dlsym(3) at least. Core dumps should contain library load addresses, so maybe that could be used somehow...
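If you'd rather let the process compute the offsets itself instead of doing the subtraction by hand, dladdr(3) is one option (a small sketch, nothing to do with Pin specifically); it reports both the object a given address belongs to and that object's load base:
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    /* Look up a libc function, then ask where it lives and what the
       containing library's load base is. The difference is the offset
       you'd look for in an objdump disassembly of that library. */
    void *addr = dlsym(RTLD_DEFAULT, "strlen");
    Dl_info info;

    if (addr && dladdr(addr, &info)) {
        printf("%s is at %p, %s is loaded at %p\n",
               info.dli_sname ? info.dli_sname : "?", addr,
               info.dli_fname, info.dli_fbase);
        printf("offset within the library: 0x%lx\n",
               (unsigned long)((char *)addr - (char *)info.dli_fbase));
    }
    return 0;
}
Compile with something like gcc -o dladdr_demo dladdr_demo.c -ldl (dladdr_demo is just a placeholder name).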