23

When a program launches (Linux ELF executable), do eax, ebx, etc. contain zeros, or can they contain anything?

(I'm not making any calls or using external libraries.)

On my machine the registers are zeroed, but can I rely on this behavior in a new process when writing asm programs?

Peter Cordes
  • 2
    Under normal circumstances, you would initialize these explicitly. Therefore, it shouldn't matter what their initial state is. – Brent Bradburn Feb 05 '12 at 07:36
  • I'm trying to save a few bytes in my executable, and I don't want to move zeros into eax if I can avoid it. I need to put 1 in eax; if I use movl, it is encoded in 5 bytes. So I want to use movb and put 1 in al instead. The result is the same, and the instruction is encoded in 2 bytes, a 3-byte gain. But that only works if eax already holds zeros by default. –  Feb 05 '12 at 07:50
  • 1
    Under what circumstances would saving this amount of code matter? Just initialize them. If the top bits of EAX don't matter, then you can initialize it with movb al,1, but don't worry about the space. – Ira Baxter May 15 '12 at 10:23
  • 3
    Demo scene, for example. :) I know about movb al, 1 or something like xor, but that still costs a few bytes of opcodes, so if I can avoid it, I will. –  May 15 '12 at 17:31
  • 1
    Similar question for ARM: http://stackoverflow.com/questions/1802783/initial-state-of-program-registers-and-stack-on-linux-arm – Ciro Santilli OurBigBook.com Apr 20 '15 at 19:48
  • 1
    Just as you should never expect an uninitialized variable to be zero, you should never expect registers, or RAM, to be in some particular state before you use them. Except for well-defined passed parameters, you should never read something before writing to it. – old_timer Feb 17 '16 at 15:45
  • Semi-related for Windows: [CPU registers state on the very start of the app. PE executables](https://stackoverflow.com/q/28730356) – Peter Cordes Apr 07 '22 at 04:25
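
The size trade-off discussed in the comments above can be sketched in C. This is only an illustration: the `demo_*` names are hypothetical, the byte arrays are the standard 32-bit x86 encodings of the instructions named in the comments, and `demo_write_al` merely models what a write to al does to the rest of eax.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit x86 encodings of three ways to get 1 into eax (AT&T syntax). */
const uint8_t demo_mov_eax_1[] = { 0xB8, 0x01, 0x00, 0x00, 0x00 }; /* movl $1, %eax */
const uint8_t demo_mov_al_1[]  = { 0xB0, 0x01 };                   /* movb $1, %al  */
const uint8_t demo_xor_inc[]   = { 0x31, 0xC0, 0x40 };             /* xorl %eax, %eax; incl %eax */

/* Model of "movb $value, %al": only the low 8 bits of eax change,
 * so the upper 24 bits keep whatever they held at process entry. */
uint32_t demo_write_al(uint32_t eax, uint8_t value)
{
    return (eax & 0xFFFFFF00u) | value;
}
```

With eax initially zero, `demo_write_al(0, 1)` yields 1, matching the 5-byte movl. With any other starting value the upper bits survive, which is why the 2-byte form is only correct when the initial register contents are known.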

3 Answers

24

This depends entirely on the ABI for each platform. Since you mention eax and ebx, let's look at x86 (as of Linux v5.17.5). In fs/binfmt_elf.c, inside load_elf_binary(), the kernel checks whether the ABI specifies any requirements for register values at program load:

/*
 * The ABI may specify that certain registers be set up in special
 * ways (on i386 %edx is the address of a DT_FINI function, for
 * example.  In addition, it may also specify (eg, PowerPC64 ELF)
 * that the e_entry field is the address of the function descriptor
 * for the startup routine, rather than the address of the startup
 * routine itself.  This macro performs whatever initialization to
 * the regs structure is required as well as any relocations to the
 * function descriptor entries when executing dynamically links apps.
 */

It then calls ELF_PLAT_INIT, a macro defined for each architecture in arch/xxx/include/asm/elf.h. For x86, it does the following:

#define ELF_PLAT_INIT(_r, load_addr)        \
    do {                                    \
        _r->bx = 0; _r->cx = 0; _r->dx = 0; \
        _r->si = 0; _r->di = 0; _r->bp = 0; \
        _r->ax = 0;                         \
    } while (0)

So, when your statically-linked ELF binary is loaded on Linux x86, you could count on these general-purpose registers being zero. That doesn't mean you should, though. :-)
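
As a rough userspace illustration of what that macro does, here is a minimal sketch. `struct demo_regs`, `DEMO_ELF_PLAT_INIT` and `demo_start_regs` are hypothetical stand-ins, not the kernel's real definitions; the only point is that the listed registers are cleared while the instruction and stack pointers get meaningful values separately.

```c
#include <string.h>

/* Hypothetical stand-in for the saved user register set (32-bit x86 subset). */
struct demo_regs {
    unsigned long bx, cx, dx, si, di, bp, ax;
    unsigned long ip, sp;
};

/* Same shape as the macro quoted above: clear the general-purpose registers. */
#define DEMO_ELF_PLAT_INIT(_r, load_addr)               \
    do {                                                \
        (_r)->bx = 0; (_r)->cx = 0; (_r)->dx = 0;       \
        (_r)->si = 0; (_r)->di = 0; (_r)->bp = 0;       \
        (_r)->ax = 0;                                   \
    } while (0)

/* Sketch of preparing the register state for a freshly loaded program. */
void demo_start_regs(struct demo_regs *r, unsigned long entry,
                     unsigned long stack_top)
{
    memset(r, 0xAA, sizeof *r);   /* pretend the struct held stale values */
    DEMO_ELF_PLAT_INIT(r, 0);     /* general-purpose registers become 0 */
    r->ip = entry;                /* execution starts at the ELF entry point */
    r->sp = stack_top;            /* the stack pointer is never zeroed */
}
```

After `demo_start_regs(&r, entry, stack_top)`, every field except `ip` and `sp` is zero, which mirrors what a statically-linked program observes at `_start`.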


Dynamic linking

Note that executing a dynamically linked binary actually runs dynamic linker code in your process before execution reaches your _start (the ELF entry point). This can and does leave garbage in registers, as allowed by the ABI, except of course for the stack pointer ESP/RSP and the atexit hook EDX/RDX, which hold the values the ABI specifies.
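
Whether the dynamic linker runs first is determined by the presence of a PT_INTERP program header in the executable. The following is a minimal sketch, assuming a 64-bit ELF image already read into memory and the Linux `<elf.h>` header; `demo_has_interp` is a hypothetical helper name, and the bounds checking is deliberately simple.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the ELF64 image has a PT_INTERP program header (so a dynamic
 * linker runs before _start), 0 if it has none, and -1 if the buffer does not
 * look like a parsable ELF64 image. */
int demo_has_interp(const unsigned char *image, size_t len)
{
    Elf64_Ehdr eh;

    if (len < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;
    if (eh.e_phentsize != sizeof(Elf64_Phdr))
        return -1;

    for (size_t i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        size_t off = (size_t)eh.e_phoff + i * sizeof ph;

        if (off > len || len - off < sizeof ph)
            return -1;            /* program header table runs past the buffer */
        memcpy(&ph, image + off, sizeof ph);
        if (ph.p_type == PT_INTERP)
            return 1;
    }
    return 0;
}
```

A result of 0 corresponds to the static case described above, where the kernel jumps straight to `_start` with the registers it prepared.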

Marco Bonelli
Michael Foukarakis
  • 3
    This zeroing still happens (2015), but is NOT required by the ABI. (See Ciro's comment on Basile's answer). – Peter Cordes Jul 12 '15 at 05:02
  • 1
    Also note that in a dynamically-linked executable, the dynamic linker runs before `_start`, and does leave garbage in registers as allowed by the ABI. Only statically-linked executables have their registers zeroed when execution reaches `_start`. – Peter Cordes Sep 18 '17 at 07:17
  • 1
    More specifically: setting all registers (except for %esp) to 0 was introduced in Linux 2.2.0 (see also https://asm.sourceforge.net/articles/startup.html). Before that, only %edx was set to 0 (or to atexit) explicitly in `ELF_PLAT_INIT`, and also %eax implicitly elsewhere. – pts Dec 19 '22 at 13:42
12

For AMD64 or x86-64 systems (64 bits) on Linux, the x86-64 ABI defines the initial content of registers.

There are similar specifications for the i386 ABI, the ARM ABI, etc.

See the Wikipedia pages on ELF and ABI.

Basile Starynkevitch
10

x86-64 System V ABI

3.4.1 "Initial Stack and Register State" (Basile linked to the PDF version of this):

  1. %rsp points to the stack

    The stack pointer holds the address of the byte with lowest address which is part of the stack. It is guaranteed to be 16-byte aligned at process entry

  2. %rdx holds a function pointer that the application should register with atexit if it's non-zero.

  3. %rbp is unspecified, but user code should set it to zero to mark the deepest stack frame.

    The content of this register is unspecified at process initialization time, but the user code should mark the deepest stack frame by setting the frame pointer to zero.

  4. Everything else is undefined.

Linux then follows it "because" the LSB says so.
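
The %rdx rule in item 2 can be sketched as follows. This is only an illustration of the logic a startup routine would apply to the value it received in %rdx; `demo_register_rdx` and `demo_noop_fini` are hypothetical names, and real startup code would do this from `_start` before calling main.

```c
#include <stdlib.h>

/* Hypothetical termination handler, standing in for whatever the
 * dynamic linker might pass in %rdx. */
void demo_noop_fini(void)
{
}

/* Apply the ABI rule to the value found in %rdx at process entry.
 * Returns 0 if there was nothing to register, 1 if the handler was
 * registered, and -1 if atexit() failed. */
int demo_register_rdx(void (*rdx_value)(void))
{
    if (rdx_value == NULL)
        return 0;                       /* zero in %rdx: nothing to do */
    return atexit(rdx_value) == 0 ? 1 : -1;
}
```

In a static executable on current Linux the argument would be NULL, so the call is a no-op; in a dynamic executable it would be the function the dynamic linker supplied.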

Ciro Santilli OurBigBook.com
  • 1
    `%rdx` is allowed to be NULL, and in current Linux it is `0` in a fresh process from a statically-linked executable. But the dynamic linker runs before `_start` when execing a dynamically-linked executable, so it can set a value in `%rdx` (and does, according to `gdb /bin/bash`). I'm not sure whether `_start` is still supposed to treat it as a function pointer at that point, but probably yes, because everything else at that point follows the ABI. – Peter Cordes Sep 18 '17 at 07:24
  • 2
    The i386 ABI (http://www.sco.com/developers/devspecs/abi386-4.pdf) has identical requirements on the corresponding 32-bit registers: `%esp`, `%edx`, `%ebp`, and everything else is undefined. – pts Jan 05 '18 at 12:08
  • Update: Yes, a non-zero `%rdx` will be a pointer to a function that runs destructors for dynamically linked libraries. In a dynamic executable, ld-linux.so passes it to the main executable's `_start` so it can register it with `atexit`. The ABI's process-entry guarantees fully apply to `_start` in a dynamic executable. (Including that most registers actually can hold garbage; that will be the case, unlike in a static executable where the kernel in practice zeros registers.) – Peter Cordes Nov 06 '22 at 22:46