As you know, it is possible to catch any signal except SIGKILL and SIGSTOP with a handler.
There are three kinds of invalid address access:

  • The attempt to execute/jump at an invalid address.
  • The attempt to read at an invalid address.
  • The attempt to write at an invalid address.

I’m only interested in rejecting invalid read accesses. So the idea is to catch all segmentation faults and call abort() if the fault is not an invalid read access.

So far, I only know how to use SEGV_MAPERR and SEGV_ACCERR with sigaction, which of course does not help here: neither code says whether the access was a read, a write, or a jump.
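To illustrate why `si_code` alone does not answer the question, here is a minimal sketch. The helper names `report_si_code` and `probe_si_code` are made up for this illustration. It forks a child, triggers a fault in it, and passes the `si_code` seen by the handler back as the exit status. A read and a write to the same unmapped address should yield the same code, since `SEGV_MAPERR` and `SEGV_ACCERR` describe the state of the mapping, not the kind of access.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Handler: hand si_code back to the parent as the child's exit status. */
static void report_si_code(int signum, siginfo_t *info, void *context)
{
    (void)signum;
    (void)context;
    _exit(info->si_code);
}

/* Hypothetical helper: fork a child that reads from (do_write == 0) or
   writes to (do_write != 0) addr, and return the si_code its SIGSEGV
   handler observed. Returns 0 if the access did not fault, -1 on error. */
static int probe_si_code(volatile unsigned char *addr, int do_write)
{
    int   status = 0;
    pid_t child = fork();

    if (child == (pid_t)-1)
        return -1;

    if (child == 0) {
        struct sigaction act;

        memset(&act, 0, sizeof act);
        sigemptyset(&act.sa_mask);
        act.sa_flags = SA_SIGINFO;
        act.sa_sigaction = report_si_code;
        if (sigaction(SIGSEGV, &act, NULL) == -1)
            _exit(127);

        if (do_write) {
            *addr = 0xC4;
        } else {
            unsigned char value = *addr;   /* volatile read, not optimized away */
            (void)value;
        }
        _exit(0);                          /* the access did not fault */
    }

    if (waitpid(child, &status, 0) != child || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 127)
        return -1;
    return WEXITSTATUS(status);
}
```

Calling `probe_si_code()` with the same bad address for a read and then for a write is expected to return the same value both times, which is why something other than `si_code` is needed to tell the two apart.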

user2284570
  • @NominalAnimal **Wrong.** As stated in the man pages, it contains the address that caused the segfault, whether it is a page read or a page write *(or an attempt to execute an invalid address)*. Just try it yourself in gdb 7 and it will quickly show you that its value can point to an address that never contained anything, namely the address of the invalid access attempt *(notice that 0x0 does not contain instructions in normal circumstances)*. – user2284570 Mar 27 '17 at 00:04
  • Good catch. Let me rephrase that: The `si_addr` field of the `siginfo_t` structure is basically useless for this. You need to look at the context (`ucontext_t`, the third parameter of a `SA_SIGINFO` handler), and depending on the hardware architecture, the `uc_mcontext.gregs` field in it, to find out the address of the instruction (`rip` field on x86-64), then decode the instruction at that address, to determine the exact cause. Note that you cannot just use any old instruction decoding library for this; it must be async-signal-safe (to be safe to use in a signal handler). – Nominal Animal Mar 27 '17 at 00:12
  • @NominalAnimal Yes, and as my use case is to detect dlmalloc metadata corruption, such a library would need to work from the stack only. As I am using the full x86_64 instruction set on a recent processor, do you mean I will need to create something able to decode more than 200 opcodes from scratch? – user2284570 Mar 27 '17 at 00:19
  • Not from scratch, I hope; any table-driven one should work. I've actually looked into this quite in-depth years ago (that's why my mistake wrt. `si_addr` vs. `info->uc_mcontext.gregs[REG_RIP]`/`info->uc_mcontext.gregs[REG_EIP]`), to simulate instructions that access specific pages (`SIGBUS` handler, skips over the simulated instruction), and it seemed then that the extra information needed in relation to each instruction would be simplest to access by .. writing my own table-driven approach. Not worth the effort in the end. – Nominal Animal Mar 27 '17 at 00:45
  • @NominalAnimal I failed to find the requirement, and even the table-driven disassembler of Chromium uses malloc for processing operand registers. – user2284570 Mar 27 '17 at 00:49
  • If by requirement you mean why only async-signal safe functions should be used, look at [man 7 signal](http://man7.org/linux/man-pages/man7/signal.7.html). Anyway, I don't think I looked at the Linux kernel one ([here](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/lib/) for x86) at the time. A quick glance makes me think it might be adaptable for this, although it would be quite a lot of work. – Nominal Animal Mar 27 '17 at 01:00
  • @NominalAnimal No, I mean the required library. I forgot one word. It looks like writing a suitable disassembler from scratch is required. – user2284570 Mar 27 '17 at 01:01
  • Another option is to put the instruction decoding and analysis in a separate program so you're not limited to async-signal safe functions (`send()` and `recv()` are async-signal safe), and use a (pre-connected) socket to send the information in a binary blob (say, 16 bytes of code, plus all register state from the mcontext_t), and receive the necessary information in another. Even then, it is quite a lot of work. You might look at [Intel XED](https://intelxed.github.io/) for example, if you go this route. – Nominal Animal Mar 27 '17 at 01:03
  • @NominalAnimal The problem is that I am doing this as part of fuzzing. Anything out of process will add a significant overhead that will cut the number of test cases per second by at least a factor of 10 *(this is what I experienced recently when calling an external utility)*. If I get an invalid write or jump access, then I have a proof of concept and fuzzing needs to be stopped. – user2284570 Mar 27 '17 at 01:14
  • Whenever you get a `SIGSEGV` (or `SIGBUS`), you already have found an error. It is not like you can ignore either signal; the kernel will simply reraise the same signal for the same address again (because it re-executes the offending instruction; unless you decode and update RIP to skip the instruction). Forking and executing an external program does have significant overhead (especially latency), but using an UNIX domain socket should keep overhead and latencies very low. I am afraid all working solutions to your problem do involve a lot of code, including writing quite a bit of new code. – Nominal Animal Mar 27 '17 at 04:23
  • @NominalAnimal: No, I simply restart the program from a previous `clone()`. I share a lot of data. I confirm that switching to gdb scripting triggers a large overhead. I just thought of a second thing: wouldn’t `uc_mcontext.gregs` point to an invalid memory area in the case of an invalid jump? Also, what about a normal direct branch to the instruction that caused the segfault? *(Remember there are instructions that both read and write.)* It looks like your suggestion to work around this would require debugging symbol analysis, or full disassembly of the executable areas in memory. – user2284570 Mar 27 '17 at 11:43
  • I don't see why you'd need gdb here? No, I just meant that to decode the 1-15 bytes for the violating instruction (plus register state), you could use a socket connection in the signal handler to an "instruction decode server", which responds with the memory access type and address, to determine the correct action and logging. The context is constructed by the kernel and stored in the process' stack, or the alternate stack (set by `sigaltstack()`) if `sa_flags` contained `SA_ONSTACK` when installing the SIGSEGV/SIGBUS handler, so there should not be any problems there. – Nominal Animal Mar 27 '17 at 11:52
  • OK, I am starting to understand now, but do you mean the instruction details are sent for disassembly only once a SIGSEGV is caught? In that case, what if the error was caused after a `munmap` system call that unmapped the glibc executable *(exec after free)*? – user2284570 Mar 27 '17 at 12:42
  • Since you only need a blocking `send()` and `recv()` in the signal handler, and you are restricted to x86-64 architecture on Linux, you can open-code the syscall wrappers in inline assembly, making the signal handler code self-contained. I shall write an answer outlining the things in this comment chain, explaining my suggestions (and its limitations). – Nominal Animal Mar 27 '17 at 13:46
  • @NominalAnimal: In the same vein, what about writing a kernel module, since the MMU has the information I’m looking for? It doesn’t look much more complex than your proposal. – user2284570 Mar 27 '17 at 13:51
  • Actually, how about using `objdump -d`, and filtering the output, to classify each instruction in the code beforehand, and compile that in to the process? It shouldn't be that hard to write an e.g. awk script for this purpose. It won't be a generic instruction decoder; it'll just know the values of `rip` that cause the crash to be handled one way, and all others some other way. There'll be a LOT of readonly data (it's rather easy to pack, though), but that should not be a problem. – Nominal Animal Mar 27 '17 at 17:50
  • @NominalAnimal: Then such a thing would have to be done on the >200 dependency shared objects. – user2284570 Mar 27 '17 at 17:54
  • "As you known, it is possible catch any signal but kill with an handler." This is incorrect. SIGSTOP also cannot be caught, and there may be others, depending on what extensions beyond the standard POSIX core signals your system employs... – twalberg Mar 27 '17 at 21:02
  • From a comment by OP, @user2284570, I did find that the `uc_mcontext.gregs[REG_ERR]` field in the `ucontext_t` provided to the signal handler contains the page fault error code bits (as documented in [`arch/x86/mm/fault.c` in the Linux kernel](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/mm/fault.c)). Thus, one possible solution is quite straightforward (full example shown in my replacement answer). – Nominal Animal Mar 27 '17 at 21:57

1 Answer

It turns out that on Linux on the x86-64 (aka AMD64) architecture, this is in fact quite feasible.

Here is an example program, crasher.c:

#define  _POSIX_C_SOURCE 200809L
#define  _GNU_SOURCE
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <ucontext.h>
#include <signal.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>

#if !defined(__linux__) || !defined(__x86_64__)
#error This example only works in Linux on x86-64.
#endif

#define  ALTSTACK_SIZE  262144

static const char hex_digit[16] = {
    '0', '1', '2', '3', '4', '5', '6', '7',
    '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'
};

static inline const char *signal_name(const int signum)
{
    switch (signum) {
    case SIGSEGV: return "SIGSEGV";
    case SIGBUS:  return "SIGBUS";
    case SIGILL:  return "SIGILL";
    case SIGFPE:  return "SIGFPE";
    case SIGTRAP: return "SIGTRAP";
    default:      return "(unknown)";
    }
}

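static inline ssize_t internal_write(int fd, const void *buf, size_t len)
{
    ssize_t retval;
    /* Raw write(2) syscall (number 1 on x86-64). The "memory" clobber tells
       the compiler that the kernel reads the bytes at buf, so stores to the
       buffer must not be reordered past or dropped before the syscall. */
    asm volatile ( "syscall\n\t"
                 : "=a" (retval)
                 : "a" (1), "D" (fd), "S" (buf), "d" (len)
                 : "rcx", "r11", "memory" );
    return retval;
}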

static inline int wrerr(const char *p, const char *q)
{
    while (p < q) {
        ssize_t n = internal_write(STDERR_FILENO, p, (size_t)(q - p));
        if (n > 0)
            p += n;
        else
        if (n == 0)
            return EIO;
        else
            return -n;
    }
    return 0;
}

static inline int wrs(const char *p)
{
    if (p) {
        const char *q = p;
        while (*q)
            q++;
        return wrerr(p, q);
    }
    return 0;
}

static inline int wrh(unsigned long h)
{
    static char buffer[4 + 2 * sizeof h];
    char       *p = buffer + sizeof buffer;

    do {
        *(--p) = hex_digit[h & 15];
        h /= 16UL;
    } while (h);

    *(--p) = 'x';
    *(--p) = '0';

    return wrerr(p, buffer + sizeof buffer);
}

static void crash_handler(int signum, siginfo_t *info, void *contextptr)
{
    if (info) {
        ucontext_t *const ctx = (ucontext_t *const)contextptr;
        wrs(signal_name(signum));
        if (ctx->uc_mcontext.gregs[REG_ERR] & 16) {
            const unsigned long sp = ctx->uc_mcontext.gregs[REG_RSP];
            /* Instruction fetch */
            wrs(": Bad jump to ");
            wrh((unsigned long)(info->si_addr));
            if (sp && !(sp & 7)) {
                wrs(" probably by the instruction just before ");
                wrh(*(unsigned long *)sp);
            }
            wrs(".\n");
        } else
        if (ctx->uc_mcontext.gregs[REG_ERR] & 2) {
            /* Write access */
            wrs(": Invalid write attempt to ");
            wrh((unsigned long)(info->si_addr));
            wrs(" by instruction at ");
            wrh(ctx->uc_mcontext.gregs[REG_RIP]);
            wrs(".\n");
        } else {
            /* Read access */
            wrs(": Invalid read attempt from ");
            wrh((unsigned long)(info->si_addr));
            wrs(" by instruction at ");
            wrh(ctx->uc_mcontext.gregs[REG_RIP]);
            wrs(".\n");
        }
    }

    raise(SIGKILL);
}

static int install_crash_handler(void)
{
    stack_t           altstack;
    struct sigaction  act;

    altstack.ss_size = ALTSTACK_SIZE;
    altstack.ss_flags = 0;
    altstack.ss_sp = mmap(NULL, altstack.ss_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_GROWSDOWN, -1, 0);
    if (altstack.ss_sp == MAP_FAILED) {
        const int retval = errno;
        fprintf(stderr, "Cannot map memory for alternate stack: %s.\n", strerror(retval));
        return retval;
    }
    if (sigaltstack(&altstack, NULL)) {
        const int retval = errno;
        fprintf(stderr, "Cannot use alternate signal stack: %s.\n", strerror(retval));
        return retval;
    }

    memset(&act, 0, sizeof act);
    sigemptyset(&act.sa_mask);
    act.sa_flags = SA_SIGINFO | SA_ONSTACK;
    act.sa_sigaction = crash_handler;
    if (sigaction(SIGSEGV, &act, NULL) == -1 ||
        sigaction(SIGBUS,  &act, NULL) == -1 ||
        sigaction(SIGILL,  &act, NULL) == -1 ||
        sigaction(SIGFPE,  &act, NULL) == -1) {
        const int retval = errno;
        fprintf(stderr, "Cannot install crash signal handlers: %s.\n", strerror(retval));
        return retval;
    }

    return 0;
}

int main(int argc, char *argv[])
{
    void         (*jump)(void) = 0;
    unsigned char *addr = (unsigned char *)0;

    if (argc < 2 || argc > 3 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
        fprintf(stderr, "\n");
        fprintf(stderr, "Usage: %s [ -h | --help ]\n", argv[0]);
        fprintf(stderr, "       %s call [ address ]\n", argv[0]);
        fprintf(stderr, "       %s read [ address ]\n", argv[0]);
        fprintf(stderr, "       %s write [ address ]\n", argv[0]);
        fprintf(stderr, "\n");
        return EXIT_SUCCESS;
    }
    if (argc > 2 && argv[2][0] != '\0') {
        char          *end = NULL;
        unsigned long  val;

        errno = 0;
        val = strtoul(argv[2], &end, 0);
        if (errno) {
            fprintf(stderr, "%s: %s.\n", argv[2], strerror(errno));
            return EXIT_FAILURE;
        }
        if (end)
            while (*end == '\t' || *end == '\n' || *end == '\v' ||
                   *end == '\f' || *end == '\r' || *end == ' ')
                end++;
        if (!end || end <= argv[2] || *end) {
            fprintf(stderr, "%s: Not a valid address.\n", argv[2]);
            return EXIT_FAILURE;
        }

        jump = (void *)val;
        addr = (void *)val;
    }

    if (install_crash_handler())
        return EXIT_FAILURE;

    if (argv[1][0] == 'c' || argv[1][0] == 'C') {
        printf("Calling address %p: ", (void *)jump);
        fflush(stdout);
        jump();
        printf("Done.\n");

    } else
    if (argv[1][0] == 'r' || argv[1][0] == 'R') {
        unsigned char  val;

        printf("Reading from address %p: ", (void *)addr);
        fflush(stdout);
        val = *addr;
        printf("0x%02x, done.\n", val);

    } else
    if (argv[1][0] == 'w' || argv[1][0] == 'W') {
        printf("Writing 0xC4 to address %p: ", (void *)addr);
        fflush(stdout);
        *addr = 0xC4;
        printf("Done.\n");
    }

    printf("No crash.\n");
    return EXIT_SUCCESS;
}

Compile it using e.g.

gcc -Wall -O2 crasher.c -o crasher

You can test a call, a read, or a write to an arbitrary address by specifying the operation and optionally the address on the command line. Run without parameters to see the usage.

Some example runs on my machine:

./crasher call 0x100
Calling address 0x100: SIGSEGV: Bad jump to 0x100 probably by the instruction just before 0x400c4e.
Killed

./crasher write 0x24
Writing 0xC4 to address 0x24: SIGSEGV: Invalid write attempt to 0x24 by instruction at 0x400bad.
Killed

./crasher read 0x16
Reading from address 0x16: SIGSEGV: Invalid read attempt from 0x16 by instruction at 0x400ca3.
Killed

./crasher write 0x400ca3
Writing 0xC4 to address 0x400ca3: SIGSEGV: Invalid write attempt to 0x400ca3 by instruction at 0x400bad.
Killed

./crasher read 0x400ca3
Reading from address 0x400ca3: 0x41, done.
No crash.

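Note that the type of the access is obtained from the ((ucontext_t *)contextptr)->uc_mcontext.gregs[REG_ERR] register (from the signal handler context); it matches the x86_pf_error_code enums as defined in arch/x86/mm/fault.c in the Linux kernel sources. In the handler above, bit 4 (value 16) marks an instruction fetch and bit 1 (value 2) marks a write; if neither is set, the access was a read.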
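As a sketch of how those bits could be given names instead of the bare 16 and 2, the fragment below classifies an error code value. The `PFERR_*` macros, the `fault_kind` enum, and both function names are local to this illustration, not kernel or glibc identifiers; the bit positions follow the page fault error code layout referenced above.

```c
/* Page fault error code bits on x86, as found in
   uc_mcontext.gregs[REG_ERR]. Names are local to this sketch. */
#define PFERR_PROT   (1UL << 0)  /* 0: page not present, 1: protection violation */
#define PFERR_WRITE  (1UL << 1)  /* 0: read access, 1: write access */
#define PFERR_USER   (1UL << 2)  /* 0: kernel mode, 1: user mode */
#define PFERR_RSVD   (1UL << 3)  /* reserved bit set in a page table entry */
#define PFERR_INSTR  (1UL << 4)  /* fault was an instruction fetch */

enum fault_kind { FAULT_READ, FAULT_WRITE, FAULT_EXEC };

/* Same order of checks as crash_handler(): fetch first, then write. */
static inline enum fault_kind classify_fault(unsigned long err)
{
    if (err & PFERR_INSTR)
        return FAULT_EXEC;
    if (err & PFERR_WRITE)
        return FAULT_WRITE;
    return FAULT_READ;
}

/* The policy from the question: only invalid reads are tolerated. */
static inline int fault_is_fatal(unsigned long err)
{
    return classify_fault(err) != FAULT_READ;
}
```

In the handler this would be called as `classify_fault((unsigned long)ctx->uc_mcontext.gregs[REG_ERR])`. One caveat, stated as my understanding rather than something tested here: the field only has this meaning when the signal was caused by a page fault, so a SIGSEGV raised for another reason (or the SIGILL and SIGFPE cases the example also catches) should not be interpreted this way.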

The crash handler itself is quite straightforward, only needing to examine the aforementioned "register" to obtain the information the OP seeks.

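For outputting the crash report, I open-coded the write() syscall. (In my tests the small buffer needed by the wrh() function did not work on the stack, so I just made it static instead. A likely cause is that the compiler is free to drop stores into a local buffer when the inline assembly does not declare that it reads memory, which is what a "memory" clobber on the asm statement is for.)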
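A minimal sketch of the same idea with the buffer on the stack, assuming x86-64 Linux. The names `raw_write` and `write_hex` are made up for this illustration; the point is the "memory" clobber, which tells the compiler that the asm statement may read the buffer.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

#if !defined(__linux__) || !defined(__x86_64__)
#error This sketch only works in Linux on x86-64.
#endif

/* write(2) as a raw syscall (number 1 on x86-64). The "memory" clobber
   makes the compiler finish all pending stores, including those into a
   local buffer, before the syscall instruction. */
static inline ssize_t raw_write(int fd, const void *buf, size_t len)
{
    ssize_t retval;
    __asm__ volatile ( "syscall"
                     : "=a" (retval)
                     : "a" (1L), "D" ((long)fd), "S" (buf), "d" (len)
                     : "rcx", "r11", "memory" );
    return retval;
}

/* Format h as "0x..." in a stack buffer and write it to fd.
   Returns 0 on success, or a positive errno value on failure. */
static inline int write_hex(int fd, unsigned long h)
{
    static const char digits[] = "0123456789abcdef";
    char        buffer[2 + 2 * sizeof h];
    char *const end = buffer + sizeof buffer;
    char       *p = end;

    do {
        *(--p) = digits[h & 15];
        h >>= 4;
    } while (h);
    *(--p) = 'x';
    *(--p) = '0';

    while (p < end) {
        ssize_t n = raw_write(fd, p, (size_t)(end - p));
        if (n > 0)
            p += n;
        else if (n == 0)
            return EIO;
        else
            return (int)(-n);   /* raw syscalls return -errno */
    }
    return 0;
}
```

For example, `write_hex(STDERR_FILENO, 0x400ca3UL)` should emit the eight characters `0x400ca3` on standard error. Keeping the buffer local also makes the function reentrant, which a static buffer is not.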

I did not bother to implement the mincore() syscall to verify for example the stack address (sp in the crash_handler() function); it might be necessary to avoid double faults (SIGSEGV occurring in the crash_handler() itself).
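A rough sketch of such a check, using the C library wrapper for `mincore()` rather than an open-coded syscall. The function name `address_is_mapped` is made up for this illustration, and the page size is passed in so that it can be looked up once, outside the signal handler.

```c
#define _DEFAULT_SOURCE 1
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if the page containing addr is mapped, 0 if it is not,
   and -1 if mincore() failed for some other reason.
   page_size must be the system page size (a power of two). */
static int address_is_mapped(const void *addr, size_t page_size)
{
    unsigned char vec[1];   /* one status byte per page queried */
    uintptr_t     page = (uintptr_t)addr & ~((uintptr_t)page_size - 1);

    if (mincore((void *)page, page_size, vec) == 0)
        return 1;

    /* mincore() reports ENOMEM when the range contains unmapped memory. */
    return (errno == ENOMEM) ? 0 : -1;
}
```

In crash_handler() this could guard the `*(unsigned long *)sp` dereference, for example `if (sp && !(sp & 7) && address_is_mapped((void *)sp, page_size) == 1)`, with `page_size` obtained from `sysconf(_SC_PAGESIZE)` before the handler is installed. Two limitations worth keeping in mind: a mapped page is not necessarily readable (it may be PROT_NONE), and the call modifies `errno`, so a handler that returns to the interrupted code should save and restore it.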

Similarly, I didn't bother to open-code the raise() at the end of crash_handler(), because nowadays on x86-64 it is implemented in the C library using the tgkill(pid, tid, signum) syscall, which means I'd also have had to open-code the getpid() and gettid() syscalls. I was just lazy.
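For completeness, a sketch of what that open-coding could look like on x86-64 Linux. The helper names (`raw_syscall3`, `raw_getpid`, `raw_gettid`, `raw_raise`) and the `NR_*` macros are made up for this illustration; the syscall numbers are the x86-64 ones and do not apply to other architectures.

```c
#include <signal.h>

#if !defined(__linux__) || !defined(__x86_64__)
#error This sketch only works in Linux on x86-64.
#endif

/* x86-64 Linux syscall numbers. */
#define NR_GETPID   39L
#define NR_GETTID  186L
#define NR_TGKILL  234L

/* Raw syscall with up to three arguments. Returns the kernel's result,
   which is a negative errno value on failure. */
static inline long raw_syscall3(long nr, long a1, long a2, long a3)
{
    long retval;
    __asm__ volatile ( "syscall"
                     : "=a" (retval)
                     : "a" (nr), "D" (a1), "S" (a2), "d" (a3)
                     : "rcx", "r11", "memory" );
    return retval;
}

static inline long raw_getpid(void)
{
    return raw_syscall3(NR_GETPID, 0L, 0L, 0L);
}

static inline long raw_gettid(void)
{
    return raw_syscall3(NR_GETTID, 0L, 0L, 0L);
}

/* Send signum to the calling thread, like raise().
   Returns 0 on success, or a negative errno value. */
static inline long raw_raise(int signum)
{
    return raw_syscall3(NR_TGKILL, raw_getpid(), raw_gettid(), (long)signum);
}
```

With this in place, the last line of crash_handler() could become `raw_raise(SIGKILL);`, leaving the handler free of C library calls.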

Finally, the above code is written quite carelessly, as I myself only found this after exchanging comments with the OP, user2284570, and just wanted to throw something together to see if this approach actually works reliably. (It seems it does, but I've only tested this lightly and only on one machine.) So, if you notice any bugs, typos, thinkos, or other things to fix in the code, please let me know in a comment, so I can fix it.

Nominal Animal
  • The compiler complains about `REG_ERR` being not defined. – user2284570 Mar 28 '17 at 17:59
  • @user2284570: You mean, with the above code? (Obviously it works for me; `REG_ERR == 19`.) It is provided by [glibc:sysdeps/unix/sysv/linux/x86/sys/ucontext.h](https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/x86/sys/ucontext.h;hb=HEAD); in userspace, you need to define `_GNU_SOURCE` before including `<ucontext.h>` (which includes `<sys/ucontext.h>`, the file I just linked to). Which C library (version) are you using? – Nominal Animal Mar 29 '17 at 07:08
  • And would it be possible to know the data that would have been written in the case of an invalid write? – user2284570 Mar 30 '17 at 11:01
  • @user2284570: I ended my old experiments at that exact point. To find out the data that would have been written, we'd need to (partially) disassemble the opcode at instruction pointer (REG_RIP on x86-64). In my case, I would have needed the instruction length (so that it could be skipped by advancing instruction pointer) and the register (including register width, so on x86-64, al/ah/ax/eax/rax and so on for every possible register), or the immediate value if the instruction stores an immediate. In your case, you only need the register/immediate. – Nominal Animal Mar 31 '17 at 05:41
  • Is it possible to get the `ucontext_t` struct with the `PTRACE_SIGINFO` method in the [`ptrace`](http://man7.org/linux/man-pages/man2/ptrace.2.html) system call when tracing another process? – Ajay Brahmakshatriya Jul 25 '19 at 05:27