
I noticed something strange in the assembly generated for these two functions:

func foo(v uint64) (b [8]byte) {
    b[0] = byte(v)
    b[1] = byte(v >> 8)
    b[2] = byte(v >> 16)
    b[3] = byte(v >> 24)
    b[4] = byte(v >> 32)
    b[5] = byte(v >> 40)
    b[6] = byte(v >> 48)
    b[7] = byte(v >> 56)
    return b
}

func foo(v uint64) [8]byte {
    var b [8]byte

    b[0] = byte(v)
    b[1] = byte(v >> 8)
    b[2] = byte(v >> 16)
    b[3] = byte(v >> 24)
    b[4] = byte(v >> 32)
    b[5] = byte(v >> 40)
    b[6] = byte(v >> 48)
    b[7] = byte(v >> 56)
    return b
}

The first version generates this assembly:

"".foo STEXT nosplit size=20 args=0x10 locals=0x0 funcid=0x0
    0x0000 00000 (main.go:6)    TEXT    "".foo1(SB), NOSPLIT|ABIInternal, $0-16
    0x0000 00000 (main.go:6)    FUNCDATA    $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x0000 00000 (main.go:6)    FUNCDATA    $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x0000 00000 (main.go:6)    MOVQ    $0, "".b+16(SP)
    0x0009 00009 (main.go:15)   MOVQ    "".v+8(SP), AX
    0x000e 00014 (main.go:15)   MOVQ    AX, "".b+16(SP)
    0x0013 00019 (main.go:16)   RET

and

"".foo STEXT nosplit size=59 args=0x10 locals=0x10 funcid=0x0
    0x0000 00000 (main.go:6)    TEXT    "".foo(SB), NOSPLIT|ABIInternal, $16-16
    0x0000 00000 (main.go:6)    SUBQ    $16, SP
    0x0004 00004 (main.go:6)    MOVQ    BP, 8(SP)
    0x0009 00009 (main.go:6)    LEAQ    8(SP), BP
    0x000e 00014 (main.go:6)    FUNCDATA    $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x000e 00014 (main.go:6)    FUNCDATA    $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x000e 00014 (main.go:6)    MOVQ    $0, "".~r1+32(SP)
    0x0017 00023 (main.go:7)    MOVQ    $0, "".b(SP)
    0x001f 00031 (main.go:16)   MOVQ    "".v+24(SP), AX
    0x0024 00036 (main.go:16)   MOVQ    AX, "".b(SP)
    0x0028 00040 (main.go:17)   MOVQ    "".b(SP), AX
    0x002c 00044 (main.go:17)   MOVQ    AX, "".~r1+32(SP)
    0x0031 00049 (main.go:17)   MOVQ    8(SP), BP
    0x0036 00054 (main.go:17)   ADDQ    $16, SP
    0x003a 00058 (main.go:17)   RET

In the second case, the compiler treats b as a separate local variable. Why does this happen, and why is the generated code so different?

go version go1.16 windows/amd64

The assembly file was produced with:

go tool compile -S main.go > main.s

https://go.godbolt.org/z/G8K79K48G - small asm code

https://go.godbolt.org/z/Yv853E6P3 - long asm code

phuclv
Pavel Burak
  • Opportunity for improvement for the Go optimizer I guess? Return values are allocated in the stack frame of the caller. The first version is filling that directly. The second version fills a local array and then copies it to the caller. – rustyx Jul 13 '21 at 23:42
  • This is a complete guess, and not even concrete, but I think that in general any language which is capable of generating readable stack traces at any time needs to impose extreme limitations on what kind of optimizations can be done to function entry and exit. – Zyl Jul 14 '21 at 09:02
  • The `BP` register is an amd64 specific callee-save that is automatically added when the frame size is non-zero. I've not so much as looked at any asm in years, but because `b` is local to the callee, hence the callee-save. The rest is pretty much the same as before, with the difference being that the array is copied to the caller, and the stack pointer is restored – Elias Van Ootegem Oct 27 '21 at 14:38

1 Answer

It is the difference between

func foo(v uint64) [8]byte {

and

func foo(v uint64) (b [8]byte) {

When you declare the result as [8]byte, you only tell the compiler the return type of foo. The result itself stays unnamed (it appears as ~r1 in the listing).

The named form (b [8]byte) also specifies the return type, but in addition it

  • reserves 8 bytes of memory for the result
  • declares the variable b, of type [8]byte
  • initializes b to zero (all 64 bits).
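
As a minimal sketch of the named-result behavior listed above: the named result already exists and holds its zero value when the function body starts, so a bare return yields all zeros. The function names zeroed and lowByte are hypothetical and do not come from the question.

```go
package main

import "fmt"

// zeroed is a hypothetical helper. Its named result b is declared and
// zero-initialized by the signature alone, so a bare return yields zeros.
func zeroed() (b [8]byte) {
	return
}

// lowByte is a hypothetical helper that writes only index 0.
// The remaining elements keep the zero value they started with.
func lowByte(v uint64) (b [8]byte) {
	b[0] = byte(v)
	return
}

func main() {
	fmt.Println(zeroed())        // [0 0 0 0 0 0 0 0]
	fmt.Println(lowByte(0x1234)) // [52 0 0 0 0 0 0 0]
}
```

No var statement is needed in either function, because the signature itself declares b.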

When you replicate (b [8]byte) manually with

var b [8]byte

the function has to carry out the steps listed above itself, for a separate local variable. That is why the second listing sets up a stack frame on entry:

0x0000 00000 (main.go:6)    SUBQ    $16, SP
0x0004 00004 (main.go:6)    MOVQ    BP, 8(SP)
0x0009 00009 (main.go:6)    LEAQ    8(SP), BP
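
The two variants differ only in where b lives, not in what they return. The following is a rough check of that, under the assumption that both are compared against encoding/binary's little-endian encoding. The names fooNamed, fooLocal, and fooRef are hypothetical renames so that all three can sit in one file.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// fooNamed mirrors the named-result variant from the question.
func fooNamed(v uint64) (b [8]byte) {
	b[0] = byte(v)
	b[1] = byte(v >> 8)
	b[2] = byte(v >> 16)
	b[3] = byte(v >> 24)
	b[4] = byte(v >> 32)
	b[5] = byte(v >> 40)
	b[6] = byte(v >> 48)
	b[7] = byte(v >> 56)
	return b
}

// fooLocal mirrors the variant with an explicit local variable.
func fooLocal(v uint64) [8]byte {
	var b [8]byte
	b[0] = byte(v)
	b[1] = byte(v >> 8)
	b[2] = byte(v >> 16)
	b[3] = byte(v >> 24)
	b[4] = byte(v >> 32)
	b[5] = byte(v >> 40)
	b[6] = byte(v >> 48)
	b[7] = byte(v >> 56)
	return b
}

// fooRef is a reference implementation using the standard library.
func fooRef(v uint64) (b [8]byte) {
	binary.LittleEndian.PutUint64(b[:], v)
	return b
}

func main() {
	v := uint64(0x0807060504030201)
	fmt.Println(fooNamed(v)) // [1 2 3 4 5 6 7 8]
	fmt.Println(fooLocal(v) == fooNamed(v), fooRef(v) == fooNamed(v)) // true true
}
```

Since the results are identical, any difference in the emitted assembly comes from how the compiler handles the separate local variable, and may vary between compiler versions.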
rnath
  • I believe OP's question was more to the tune of "Why do these two functions which are identical in behavior not optimize to the same assembly?" (something one might find reasonable to expect in spite of the reality) – Zyl Sep 21 '21 at 16:06