
I would like to understand the following behavior of GNU as.

The following test program on OS X (Apple cctools-822/GNU as 1.38)

    .globl foo
    jmp foo
foo:
    ret

is encoded to

    00000000    e900000000    jmp         0x00000005
foo:
    00000005    c3            ret

while GNU as on Linux (GNU as 2.22) encodes to

                                .global foo
    0000        E9FCFFFF        jmp 0x35 # foo
                FF
foo:
    0005        C3              ret

Why does the latter produce a (to me) weird jump?

Moreover, apparently this magic 0xfcffffff address is used for every jump to a global label:

test2.s

    .globl foo
    jmp foo
    .globl bar
    jmp bar
    .globl baz
    jmp baz
foo:
    push $1
    ret
bar:
    push $2
    ret
baz:
    push $3
    ret

produces the following with GNU as on Linux (GNU as 2.22):

                        .globl foo
    0000    E9FCFFFF    jmp foo
            FF
                        .globl bar
    0005    E9FCFFFF    jmp bar
            FF
                        .globl baz
    000a    E9FCFFFF    jmp baz
            FF
foo:
    000f    6A01        push $1
    0011    C3          ret
bar:
    0012    6A02        push $2
    0014    C3          ret
baz:
    0015    6A03        push $3
    0017    C3          ret

Can anyone explain this kind of behavior?

Tobias

2 Answers


It is just a different type of relocation entry (R_386_PC32).

You don't have to worry about it, the linker will insert the correct address.

You can see the relocation entries if you add the -r option for objdump, e.g.

objdump -Dr test2.o

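Note that the bytes FC FF FF FF encode the value 0xfffffffc = -4, as x86 is little-endian. The -4 is the relocation addend: the CPU measures a rel32 displacement from the end of the jump instruction, which lies 4 bytes past the start of the displacement field being relocated.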
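As a rough illustration of that decoding, the sketch below reads the placeholder bytes from the listing as a signed little-endian 32-bit value and computes the target a disassembler would report for the unrelocated instruction. The helper names `decode_rel32` and `apparent_target` are made up for this example.

```python
import struct

def decode_rel32(field: bytes) -> int:
    """Interpret 4 bytes as a signed little-endian 32-bit displacement."""
    return struct.unpack("<i", field)[0]

def apparent_target(insn_addr: int, insn: bytes) -> int:
    """Target a disassembler would compute for an unrelocated `jmp rel32`.

    The displacement is relative to the address of the next instruction,
    i.e. insn_addr + 5 for the 5-byte E9 form.
    """
    assert insn[0] == 0xE9 and len(insn) == 5
    return insn_addr + len(insn) + decode_rel32(insn[1:])

# Bytes taken from the GNU as listing in the question.
placeholder = bytes.fromhex("E9FCFFFFFF")

print(decode_rel32(placeholder[1:]))         # -4
print(hex(apparent_target(0, placeholder)))  # 0x1
```

So before linking, the jump at offset 0 appears to point into the middle of itself, which is why the disassembly of the object file looks odd.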

See also this question.

starblue

I assume you are disassembling the object file, not the executable? This is typical for all toolchains and all languages that compile to object files before linking. The linker links the objects: it ties together the globals (function names, variables, and so on). Until the link stage there is no way of knowing the final address layout or where externally defined names will end up. So globals, and (depending on the instruction set and the reach of its branch instructions) some locals as well, cannot be resolved until link time. The object file therefore contains some sort of filler data in place of the final address, which will probably disassemble in a strange way.
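To make the placeholder idea concrete, here is a toy sketch of what a linker does with 32-bit PC-relative fields, using the standard i386 formula value = S + A - P (symbol address, addend stored in the field, address of the field). The code bytes, symbol offsets and relocation offsets are read off the test2.s listing in the question; the real relocation table of test2.o was not shown, so treat them as assumptions. `apply_pc32` is a hypothetical helper, not part of any real tool.

```python
import struct

def apply_pc32(code: bytearray, offset: int, symbol_addr: int) -> None:
    """Patch one 32-bit PC-relative field in place: value = S + A - P."""
    addend = struct.unpack_from("<i", code, offset)[0]  # A: placeholder in the field
    place = offset                                      # P: section assumed to start at 0
    struct.pack_into("<i", code, offset, symbol_addr + addend - place)

# Section contents as shown in the GNU as listing for test2.s.
code = bytearray.fromhex(
    "E9FCFFFFFF" "E9FCFFFFFF" "E9FCFFFFFF"  # three jmp rel32 with -4 placeholders
    "6A01C3" "6A02C3" "6A03C3"              # foo, bar, baz bodies
)

# Assumed symbol offsets and relocation sites, taken from the listing.
symbols = {"foo": 0x0F, "bar": 0x12, "baz": 0x15}
relocs = [(0x01, "foo"), (0x06, "bar"), (0x0B, "baz")]

for field_offset, name in relocs:
    apply_pc32(code, field_offset, symbols[name])

print(code[:15].hex())  # e90a000000e908000000e906000000
```

After patching, each displacement equals the distance from the end of its jump to the label (10, 8 and 6 bytes), which is what the processor needs. Because all symbols here sit in the same section, S - P does not depend on where the section is finally loaded.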

old_timer
  • Then how does the Apple assembler know the right places beforehand when the Linux/GNU one does not? – Tobias Jun 12 '12 at 10:09
  • The assembler would have had to be given the target address, like in the old days (.org 0x1234) or in some other form; for assemble-to-object then link-to-executable, this could be how you tell the linker where to put it. Each toolchain can have its own solution. Generally, in a two-step build (language to object, objects linked to executable), the objects cannot have global knowledge, nor know how far away a target is (to decide between a near and a far access), so placeholders are built in to be resolved by the linker. By definition, that is what a linker does. – old_timer Jun 12 '12 at 13:50
  • Note that doing something like gcc myprog.c -o myprog *IS* doing the two-step: it compiles to an object, then separately calls ld to link, then cleans up the intermediate files, so you don't know that three or four programs were involved in going from .c to executable. – old_timer Jun 12 '12 at 13:52
  • Well, incidentally, I only ran `as`, neither `gcc` nor `ld`, but still the Apple assembler is generating relative jumps with the “right” destination immediately. – Tobias Jun 12 '12 at 13:56
  • Please post an example; what is going on probably makes sense. – old_timer Jun 12 '12 at 14:37
  • The example you posted above makes sense and is what I am talking about. Post code with a global, unresolved destination. – old_timer Jun 12 '12 at 14:39
  • OK, now I see that the Apple assembler actually uses a 32-bit 0x0 as the placeholder in my example. So, yes, you are right. Thank you. – Tobias Jun 12 '12 at 19:38