4

The Go code is here:

package main

func add(a, b int) int {
    sum := 0
    sum = a + b
    return sum
}
func main() {
    println(add(1, 2))
}

The Go version is

$ go version
go version go1.19.1 darwin/amd64

I use the following command to get assembly:

$ go tool compile -N -l -S main.go

main.add STEXT nosplit size=70 args=0x10 locals=0x18 funcid=0x0 align=0x0
        0x0000 00000 (main.go:3)        TEXT    main.add(SB), NOSPLIT|ABIInternal, $24-16
        0x0000 00000 (main.go:3)        SUBQ    $24, SP
        0x0004 00004 (main.go:3)        MOVQ    BP, 16(SP)
        0x0009 00009 (main.go:3)        LEAQ    16(SP), BP
        0x000e 00014 (main.go:3)        FUNCDATA        $0, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
        0x000e 00014 (main.go:3)        FUNCDATA        $1, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
        0x000e 00014 (main.go:3)        FUNCDATA        $5, main.add.arginfo1(SB)
        0x000e 00014 (main.go:3)        MOVQ    AX, main.a+32(SP)
        0x0013 00019 (main.go:3)        MOVQ    BX, main.b+40(SP)
        0x0018 00024 (main.go:3)        MOVQ    $0, main.~r0(SP)
        0x0020 00032 (main.go:4)        MOVQ    $0, main.sum+8(SP)
        0x0029 00041 (main.go:5)        MOVQ    main.a+32(SP), AX
        0x002e 00046 (main.go:5)        ADDQ    main.b+40(SP), AX
        0x0033 00051 (main.go:5)        MOVQ    AX, main.sum+8(SP)
        0x0038 00056 (main.go:6)        MOVQ    AX, main.~r0(SP)
        0x003c 00060 (main.go:6)        MOVQ    16(SP), BP
        0x0041 00065 (main.go:6)        ADDQ    $24, SP
        0x0045 00069 (main.go:6)        RET
        0x0000 48 83 ec 18 48 89 6c 24 10 48 8d 6c 24 10 48 89  H...H.l$.H.l$.H.
        0x0010 44 24 20 48 89 5c 24 28 48 c7 04 24 00 00 00 00  D$ H.\$(H..$....
        0x0020 48 c7 44 24 08 00 00 00 00 48 8b 44 24 20 48 03  H.D$.....H.D$ H.
        0x0030 44 24 28 48 89 44 24 08 48 89 04 24 48 8b 6c 24  D$(H.D$.H..$H.l$
        0x0040 10 48 83 c4 18 c3                                .H....
main.main STEXT size=86 args=0x0 locals=0x20 funcid=0x0 align=0x0
        0x0000 00000 (main.go:8)        TEXT    main.main(SB), ABIInternal, $32-0
        0x0000 00000 (main.go:8)        CMPQ    SP, 16(R14)
        0x0004 00004 (main.go:8)        PCDATA  $0, $-2
        0x0004 00004 (main.go:8)        JLS     79
        0x0006 00006 (main.go:8)        PCDATA  $0, $-1
        0x0006 00006 (main.go:8)        SUBQ    $32, SP
        0x000a 00010 (main.go:8)        MOVQ    BP, 24(SP)
        0x000f 00015 (main.go:8)        LEAQ    24(SP), BP
        0x0014 00020 (main.go:8)        FUNCDATA        $0, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
        0x0014 00020 (main.go:8)        FUNCDATA        $1, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
        0x0014 00020 (main.go:9)        MOVL    $1, AX
        0x0019 00025 (main.go:9)        MOVL    $2, BX
        0x001e 00030 (main.go:9)        PCDATA  $1, $0
        0x001e 00030 (main.go:9)        NOP
        0x0020 00032 (main.go:9)        CALL    main.add(SB)
        0x0025 00037 (main.go:9)        MOVQ    AX, main..autotmp_0+16(SP)
        0x002a 00042 (main.go:9)        CALL    runtime.printlock(SB)
        0x002f 00047 (main.go:9)        MOVQ    main..autotmp_0+16(SP), AX
        0x0034 00052 (main.go:9)        CALL    runtime.printint(SB)
        0x0039 00057 (main.go:9)        CALL    runtime.printnl(SB)
        0x003e 00062 (main.go:9)        NOP
        0x0040 00064 (main.go:9)        CALL    runtime.printunlock(SB)
        0x0045 00069 (main.go:10)       MOVQ    24(SP), BP
        0x004a 00074 (main.go:10)       ADDQ    $32, SP
        0x004e 00078 (main.go:10)       RET
        0x004f 00079 (main.go:10)       NOP
        0x004f 00079 (main.go:8)        PCDATA  $1, $-1
        0x004f 00079 (main.go:8)        PCDATA  $0, $-2
        0x004f 00079 (main.go:8)        CALL    runtime.morestack_noctxt(SB)
        0x0054 00084 (main.go:8)        PCDATA  $0, $-1
        0x0054 00084 (main.go:8)        JMP     0
        0x0000 49 3b 66 10 76 49 48 83 ec 20 48 89 6c 24 18 48  I;f.vIH.. H.l$.H
        0x0010 8d 6c 24 18 b8 01 00 00 00 bb 02 00 00 00 66 90  .l$...........f.
        0x0020 e8 00 00 00 00 48 89 44 24 10 e8 00 00 00 00 48  .....H.D$......H
        0x0030 8b 44 24 10 e8 00 00 00 00 e8 00 00 00 00 66 90  .D$...........f.
        0x0040 e8 00 00 00 00 48 8b 6c 24 18 48 83 c4 20 c3 e8  .....H.l$.H.. ..
        0x0050 00 00 00 00 eb aa                                ......
        rel 33+4 t=7 main.add+0
        rel 43+4 t=7 runtime.printlock+0
        rel 53+4 t=7 runtime.printint+0
        rel 58+4 t=7 runtime.printnl+0
        rel 65+4 t=7 runtime.printunlock+0
        rel 80+4 t=7 runtime.morestack_noctxt+0
go.cuinfo.producer.<unlinkable> SDWARFCUINFO dupok size=0
        0x0000 2d 4e 20 2d 6c 20 72 65 67 61 62 69              -N -l regabi
go.cuinfo.packagename.main SDWARFCUINFO dupok size=0
        0x0000 6d 61 69 6e                                      main
main..inittask SNOPTRDATA size=24
        0x0000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        0x0010 00 00 00 00 00 00 00 00                          ........
gclocals·g2BeySu+wFnoycgXfElmcg== SRODATA dupok size=8
        0x0000 01 00 00 00 00 00 00 00                          ........
main.add.arginfo1 SRODATA static dupok size=5
        0x0000 00 08 08 08 ff                                   .....

In my understanding, in the processing of the function main call function add, the stack would be(SP is the top of function add):

  • 0~8: ~ro, the return value of add
  • 8~16: sum, the local variable sum
  • 16~24: BP, use to return caller main
  • 32~40: a, the parameter a of add
  • 40~456: b, the parameter b of add

But what is stored in the 24~32? I can not get it by reading assembly.

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
spongecaptain
  • 121
  • 1
  • 11
  • 2
    Probably nothing; this is probably being used for alignment. x86 CPUs perform better with the stack pointer aligned to be zero-mod-16, so that using `subq $32.sp` makes the code run faster than using `subq $24,sp`. The compiler knows (or at least believes) this and therefore does it to make your code run faster. – torek Oct 12 '22 at 04:19
  • 1
    @torek: Nothing *directly* performs slower with a stack pointer that's only aligned by 8. But 16-byte stack alignment makes it free to get 16-byte alignment for local vars when needed, e.g. so SIMD can be used a bit more efficiently, or `lock cmpxchg16b` can be used on a pair of pointers for lock-free atomics. When x86-64 was new, misaligned SIMD was a lot more expensive (e.g. `movups` was slow even if the memory was aligned), so it made a lot of sense for the x86-64 System V calling convention to maintain 16B alignment. – Peter Cordes Oct 12 '22 at 07:34
  • It seems Go uses its own custom calling convention (EAX and EBX for the first 2 args surprisingly, so EBX is call-clobbered, unlike every other x86 convention). But I assume they liked the 16-byte stack alignment idea from x86-64 SysV and Windows x64, and kept that. – Peter Cordes Oct 12 '22 at 07:37
  • @PeterCordes: That's why I added the "or at least believes" part. :-) I'm not actually up on how the latest implementations behave these days. Timings are so tricky now, with out of order execution, branch caches, etc. – torek Oct 12 '22 at 08:24

0 Answers0