What does this set of instructions do?

Question

   7ffff7a97759    mov    0x33b780(%rip),%rax        # 0x7ffff7dd2ee0
   7ffff7a97760    mov    (%rax),%rax
   7ffff7a97763    test   %rax,%rax
   7ffff7a97766    jne    0x7ffff7a9787a

I can't figure out what these instructions do. Can someone explain?

I take it "not doing anything" means not jumping? It still does stuff — Shade, Apr 03 '15 at 20:02

Crowman · Accepted Answer · 2015-04-04T01:19:24.513

Going one step at a time...

7ffff7a97759    mov    0x33b780(%rip),%rax        # 0x7ffff7dd2ee0

This:

1. Takes the address in rip and adds 0x33b780 to it. At this point, rip contains the address of the next instruction, which is 0x7ffff7a97760. Adding 0x33b780 to that gives you 0x7ffff7dd2ee0, which is the address in the comment.
2. Copies the 8 byte value stored at that address into rax.

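Let's agree to call this 8 byte value "the pointer". Because it is reached at a fixed offset from the code, 0x7ffff7dd2ee0 is almost certainly a location in the data area of the same loaded module (in a shared library, quite possibly a global offset table entry), not on the stack.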
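The address arithmetic for that first instruction can be checked with a small C sketch. The helper name rip_relative_target and its parameters are made up for illustration. The instruction length of 7 bytes comes from the gap between 0x7ffff7a97759 and the next instruction at 0x7ffff7a97760.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of any real API: computes the address
 * that a rip-relative operand refers to. rip already points at the
 * next instruction when the displacement is applied, so the length of
 * the current instruction has to be added first.
 */
uint64_t rip_relative_target(uint64_t insn_addr, unsigned insn_len,
                             int32_t disp)
{
    /* x86-64 instructions are at most 15 bytes long */
    assert(insn_len >= 1 && insn_len <= 15);

    uint64_t next_insn = insn_addr + insn_len;   /* value held in rip */
    return next_insn + (uint64_t)(int64_t)disp;  /* sign-extended disp32 */
}
```

Calling rip_relative_target(0x7ffff7a97759, 7, 0x33b780) gives 0x7ffff7dd2ee0, matching the address in the disassembler's comment.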

7ffff7a97760    mov    (%rax),%rax

This copies the 8 byte value stored at the address in the pointer into rax.

7ffff7a97763    test   %rax,%rax

This performs a bitwise AND of rax with itself, discarding the result, but modifying the flags.

7ffff7a97766    jne    0x7ffff7a9787a

This jumps to location 0x7ffff7a9787a if the result of that bitwise AND is not zero, in other words, if the value stored in rax is not zero.
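As a rough model of the test and jne pair, here is a C sketch. The struct and function names are invented for illustration, and only the zero and sign flags are modelled (the real test instruction also sets the parity flag and clears the carry and overflow flags).

```c
#include <stdbool.h>
#include <stdint.h>

/* Only the flags relevant here; a real CPU tracks more. */
struct flags {
    bool zf;  /* zero flag: the AND result was 0 */
    bool sf;  /* sign flag: top bit of the AND result was set */
};

/* Models `test b, a`: AND the operands and keep only the flags. */
struct flags model_test(uint64_t a, uint64_t b)
{
    uint64_t result = a & b;  /* the real instruction discards this */
    struct flags f = { .zf = (result == 0), .sf = (result >> 63) != 0 };
    return f;
}

/* Models `jne`: the branch is taken when the zero flag is clear. */
bool model_jne_taken(struct flags f)
{
    return !f.zf;
}
```

Since rax AND rax is just rax, model_jne_taken(model_test(rax, rax)) is true exactly when rax is nonzero.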

So in summary, this means "find the 8 byte value stored at the address contained in the pointer indicated by rip plus 0x33b780, and if that value is not zero, jump to location 0x7ffff7a9787a". For instance, in C terms, the pointer stored at 0x7ffff7dd2ee0 might be a long *, and this code checks whether the long that it points to contains 0.

Its equivalent in C might be something like:

long l = 0;
long * p = &l;   /*  Assume address of p is 0x7ffff7dd2ee0  */


/*  Assembly instructions in your question start here  */

if ( *p == 0 ) {
    /*  This would be the instruction after the jne  */
    /*  Do stuff  */
}

/*  Location 0x7ffff7a9787a would be here, after the if block  */
/*  Do other stuff  */

Here's a full program showing the use of this construct, the only difference being we find our pointer with reference to the frame pointer, rather than to the instruction pointer:

.global _start

        .section .rodata

iszerostr:      .ascii  "Value of a is zero\n"
isntzerostr:    .ascii  "Value of a is not zero\n"

        .section .data

a:      .quad   0x00                    #  We'll be testing this for zero...

        .section .text

_start:
        mov     %rsp, %rbp              #  Initialize rbp
        sub     $16, %rsp               #  Allocate stack space
        lea     (a), %rax               #  Store pointer to a in rax...
        mov     %rax, -16(%rbp)         #  ...and then store it on stack

        #  Start of the equivalent of your code

        mov     -16(%rbp), %rax         #  Load pointer to a into rax
        mov     (%rax), %rax            #  Dereference pointer and get value
        test    %rax, %rax              #  Compare pointed-to value to zero
        jne     .notzero                #  Branch if not zero

        #  End of the equivalent of your code

.zero:
        lea     (iszerostr), %rsi       #  Address of string
        mov     $19, %rdx               #  Length of string
        jmp     .end

.notzero:
        lea     (isntzerostr), %rsi     #  Address of string
        mov     $23, %rdx               #  Length of string

.end:
        mov     $1, %rax                #  write() system call number
        mov     $1, %rdi                #  Standard output
        syscall                         #  Make system call

        mov     $60, %rax               #  exit() system call number
        mov     $0, %rdi                #  zero exit status
        syscall                         #  Make system call

with output:

paul@thoth:~/src/asm$ as -o tso.o tso.s; ld -o tso tso.o
paul@thoth:~/src/asm$ ./tso
Value of a is zero
paul@thoth:~/src/asm$
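For comparison, a C sketch of the same program logic follows. The function names are hypothetical. It lets sizeof work out the string lengths rather than counting them by hand, which comes to 19 and 23 bytes for these two strings.

```c
#include <stddef.h>
#include <unistd.h>

static const char is_zero_str[]   = "Value of a is zero\n";
static const char isnt_zero_str[] = "Value of a is not zero\n";

/* Hypothetical helper: picks the message for *p and reports its length. */
const char *pick_message(const long *p, size_t *len)
{
    if (*p == 0) {
        *len = sizeof is_zero_str - 1;   /* drop the trailing '\0' */
        return is_zero_str;
    }
    *len = sizeof isnt_zero_str - 1;
    return isnt_zero_str;
}

/* Writes the chosen message to standard output via write(). */
long report(const long *p)
{
    size_t len;
    const char *msg = pick_message(p, &len);
    return (long)write(STDOUT_FILENO, msg, len);
}
```

Calling report(&a) with a long a = 0 should print the same line as the assembly version.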

Incidentally, the reason for calculating an offset based on the instruction pointer is to improve the efficiency of position independent code, which is necessary for shared libraries. Hard coding memory addresses and shared libraries don't mix so well, but if you know code and data will always be the same distance apart, then referencing data via the instruction pointer gives you an easy way to produce relocatable code. Without that ability, it's usually necessary to add an extra step, such as first loading the current code address into a register, as 32-bit x86 code has to do.
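To see where a sequence like the one in the question might come from, consider this C sketch. The names library_flag and flag_is_set are invented, and the exact instructions depend on the compiler and optimisation level, so treat the comment as a typical outcome rather than a guarantee.

```c
/* Hypothetical global; imagine it living in a shared library. */
long library_flag = 0;

/*
 * Returns 1 when the flag is nonzero. When built as position
 * independent code (for example with gcc -fPIC), a compiler will
 * typically load the address of library_flag from the global offset
 * table with a rip-relative mov, dereference it, and then test the
 * value, much like the instructions in the question.
 */
int flag_is_set(void)
{
    return library_flag != 0;
}
```

Compiling with -S and reading the generated assembly is a quick way to check what your own toolchain produces.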

One quick question: why are they accessing a data element by offsetting off the instruction pointer? Is this common? Things changed a lot since I used to write asm code on the 386/486 etc.. I feel OLD! Thanks for the very excellent explanation. — Bing Bang, Apr 04 '15 at 00:57
@BingBang: It's for improving the efficiency of position independent code, which is necessary for shared libraries. Hard coding memory addresses and shared libraries don't mix so well, but if you know code and data will always at least be the same distance apart, then referencing code and data via the instruction pointer gives you an easy way to produce relocatable code. I believe this ability was introduced with x86_64. Without that ability, it's usually necessary to have a layer of indirection, since relative addresses are limited in range. — Crowman, Apr 04 '15 at 01:06
