
I'm tracing system calls such as read, write and open using ptrace. When the tracee makes an open system call, the arguments passed to it can be retrieved from user_regs_struct. The first argument is stored in the rdi register, whose contents are an unsigned long long int. How do I retrieve the file name string from this, e.g. the foo.txt that was passed to the open system call?

Sagar
  • It's a pointer to where that string is located in the process's memory. – Crowman Sep 04 '16 at 03:52
  • Yes but I'm not able to read the same. Is there any special way to read process address space? I'm either getting blank string or . – Sagar Sep 04 '16 at 03:56
  • Yes, using `ptrace`. Look up `PTRACE_PEEKTEXT` and `PTRACE_PEEKDATA`. – Crowman Sep 04 '16 at 04:03
  • The `strace` program does all this and more. I'd examine the source code for it as practical examples of how to get this information. – Craig Estey Sep 04 '16 at 17:52

1 Answer


The arguments passed to open (and to any other system call that receives a string) are pointers to the string's location in memory.

Unfortunately for you, these pointers refer to the debuggee's address space and cannot be dereferenced directly from the debugger's address space.

The ptrace documentation would have you repeatedly call ptrace(PTRACE_PEEKDATA, ...) to obtain the string one word at a time (4 or 8 bytes, depending on the platform). You can see an example of how to do that in the old fakeroot-ng code. Note that that code is under the GPL, so don't copy it verbatim unless your own code is under a compatible license.

Fortunately, there is a method of getting the data that is both simpler and faster. It is less portable, but if you're writing ptrace code, you usually don't care.

The tracer thread can open the file /proc/pid/mem (where pid is, of course, the numeric pid of the debuggee). Note that only the ptracer of pid can open that file.

Once that file is open, you can pread(2) from it at the offset of the address you want, and get the data directly from the debuggee's address space. Since the read interface isn't limited to one word at a time, you can guess at the string's length, read that many bytes in, and then find the terminating NUL to learn the precise string.

Since you are invoking fewer system calls, this method is not only easier, it is also much faster.
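A rough sketch of the /proc/pid/mem approach follows. The helper name read_remote_string is invented for this example, the buffer size passed in plays the role of the "guess" at the string's length, and on a 32-bit build you would probably want 64-bit file offsets so that high addresses fit in off_t.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read a NUL-terminated string at addr in process pid
 * through /proc/<pid>/mem. Returns the string length, or -1 on error or if
 * no NUL was found in the bytes that were read. */
static long read_remote_string(pid_t pid, unsigned long long addr,
                               char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);

    char path[64];
    snprintf(path, sizeof path, "/proc/%ld/mem", (long)pid);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* One call fetches up to cap bytes; the file offset is the address
     * in the target's address space. Fewer bytes may come back. */
    ssize_t got = pread(fd, buf, cap, (off_t)addr);
    close(fd);
    if (got <= 0)
        return -1;

    /* Find the terminator inside what was actually read. */
    char *nul = memchr(buf, '\0', (size_t)got);
    if (nul == NULL)
        return -1;
    return (long)(nul - buf);
}
```

In a real tracer you would likely open the file once per tracee and keep the descriptor around, rather than reopening it for every string as this sketch does.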

Shachar Shemesh