OCaml calling convention: is this an accurate summary?

Question

I've been trying to find the OCaml calling convention so that I can manually interpret the stack traces that gdb can't parse. Unfortunately, it seems like nothing has ever been written down in English except for general observations. E.g., people will comment on blogs that OCaml passes many arguments in registers. (If there is English documentation somewhere, a link would be much appreciated.)

So I've been trying to puzzle it out from the ocamlopt source. Could anyone confirm the accuracy of these guesses?

And, if I'm right about the first ten arguments being passed in registers, is it just not generally possible to recover the arguments to a function call? In C, the arguments would still be pushed onto the stack somewhere, if only I walk back up to the correct frame. In OCaml, it would seem that callees are free to destroy their callers' arguments.

Register allocation (from /asmcomp/amd64/proc.ml)

For calling into OCaml functions,

The first 10 integer and pointer arguments are passed in the registers rax, rbx, rdi, rsi, rdx, rcx, r8, r9, r10 and r11
The first 10 floating-point arguments are passed in the registers xmm0 - xmm9
Additional arguments are pushed onto the stack (leftmost-first-in?), floats and ints and pointers intermixed
The trap pointer (see Exceptions below) is passed in r14
The allocation pointer (presumably for the minor heap as described in this blog post) is passed in r15
The return value is passed back in rax if it is an integer or pointer, and in xmm0 if it is a float
All registers are caller-save?
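
A minimal OCaml sketch of the placement rule guessed above. It only models this summary and is not taken from the compiler: the names `kind`, `location` and `assign` are made up for illustration, and the register order is the one listed in this question, which may differ between compiler versions.

```ocaml
(* Toy model of the guessed OCaml-to-OCaml argument placement on amd64.
   All names here are hypothetical; the register order is the question's
   guess, not an authoritative list. *)
type kind = Int_or_pointer | Float

type location =
  | Reg of string   (* passed in the named register *)
  | Stack of int    (* passed in the n-th stack slot, counted from 0 *)

let int_regs =
  [| "rax"; "rbx"; "rdi"; "rsi"; "rdx"; "rcx"; "r8"; "r9"; "r10"; "r11" |]

let float_regs = Array.init 10 (fun i -> Printf.sprintf "xmm%d" i)

(* Walk the arguments left to right. Each class draws from its own
   register file; once a class runs out, the argument takes the next
   stack slot, so ints and floats end up intermixed on the stack. *)
let assign (args : kind list) : location list =
  let next_int = ref 0 and next_float = ref 0 and next_slot = ref 0 in
  let take regs counter =
    if !counter < Array.length regs then begin
      let r = regs.(!counter) in
      incr counter;
      Reg r
    end else begin
      let s = !next_slot in
      incr next_slot;
      Stack s
    end
  in
  let place acc = function
    | Int_or_pointer -> take int_regs next_int :: acc
    | Float -> take float_regs next_float :: acc
  in
  List.rev (List.fold_left place [] args)
```

For example, `assign [Int_or_pointer; Float; Int_or_pointer]` yields `[Reg "rax"; Reg "xmm0"; Reg "rbx"]` under this model, and an eleventh integer argument lands in `Stack 0`.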

For calling into C functions, the standard amd64 C convention is used:

The first six integer and pointer arguments are passed in rdi, rsi, rdx, rcx, r8, and r9
The first eight float arguments are passed in xmm0 - xmm7
Additional arguments are pushed onto the stack
The return value is passed back in rax or xmm0
The registers rbx, rbp, and r12 - r15 are callee-save

Return address (from /asmcomp/amd64/emit.mlp)

The return address is the first pointer pushed into the call frame, in accordance with amd64 C convention. (I'm guessing the ret instruction assumes this layout.)

Exceptions (from /asmcomp/linearize.ml)

The code try (...body...) with (...handler...); (...rest...) gets linearized like this:

Lsetuptrap .body
(...handler...)
Lbranch .join
Llabel .body
Lpushtrap
(...body...)
Lpoptrap
Llabel .join
(...rest...)

and then emitted as assembly like this (destinations on the right):

call .body
(...handler...)
jmp .join
.body:
pushq %r14
movq %rsp, %r14
(...body...)
popq %r14
addq $8, %rsp
.join:
(...rest...)

Somewhere in the body, there's a linearized opcode Lraise which gets emitted as this exact assembly:

movq %r14, %rsp
popq %r14
ret

Which is really neat! Instead of this setjmp/longjmp business, we create a dummy frame whose return address is the exception handler and whose only local is the previous such dummy frame. A comment in /asmcomp/amd64/proc.ml calls %r14 the "trap pointer", so I'll call this dummy frame the trap frame. When we want to raise an exception, we set the stack pointer to the most recent trap frame, set the trap pointer to the trap frame before that, and then "return" into the exception handler. And I bet if the exception handler can't handle this exception, it just reraises it.

The exception is in %rax.
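
To check that reading of the trap frames, here is a toy OCaml model of the stack. It is a sketch under my own assumptions, not compiler code: the record `machine` and the functions `push_trap`, `pop_trap` and `raise_` are hypothetical names, handler "addresses" are plain integers, and each stack slot is one array cell.

```ocaml
(* Toy simulation of the trap-frame chain. The stack grows downward,
   as on amd64. Everything here is illustrative only. *)
type machine = {
  stack : int array;
  mutable rsp : int;   (* index of the current top-of-stack slot *)
  mutable r14 : int;   (* trap pointer: newest trap frame, -1 if none *)
}

let create size = { stack = Array.make size 0; rsp = size; r14 = -1 }

let push m v =
  m.rsp <- m.rsp - 1;
  m.stack.(m.rsp) <- v

let pop m =
  let v = m.stack.(m.rsp) in
  m.rsp <- m.rsp + 1;
  v

(* Entering a try body: the call pushes the handler address, then the
   old trap pointer is saved and r14 is pointed at the new frame. *)
let push_trap m ~handler =
  push m handler;
  push m m.r14;
  m.r14 <- m.rsp

(* Leaving the body normally: restore the old trap pointer and drop the
   handler address without jumping to it. *)
let pop_trap m =
  m.r14 <- pop m;
  m.rsp <- m.rsp + 1

(* Raising: cut the stack back to the newest trap frame, restore the
   previous trap pointer, and "return" to the handler. The result is
   the handler address that control would jump to. *)
let raise_ m =
  m.rsp <- m.r14;
  m.r14 <- pop m;
  pop m
```

With two nested traps, the first `raise_` returns the inner handler and a second `raise_` from there (the reraise case) returns the outer one, leaving `r14` back at its initial value.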

gasche · Answer 1 · 2012-07-04T06:58:10.410

This is more an answer than a question! What little I know on this topic I learned by looking at the source, just like you, so don't expect further details to be much more authoritative than your post.

Yes, I think OCaml uses specialized calling conventions with caller-save registers only. A benefit of this choice is that it simplifies tail-calls: when you jump through a tail-call¹, you don't have to spill or reload any register.

¹: for non-self tail calls, this only works when there are not too many arguments, and therefore we don't need to spill. If stack allocation is needed, the call is turned into a non-tail call.
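
As a small illustration of a non-self tail call, here is a sketch with two mutually recursive functions (the names `is_even` and `is_odd` are just for the example). Each call is the last thing its caller does and takes a single argument, so I would expect it to be compiled as a jump rather than a call, and the stack should not grow with `n`.

```ocaml
(* Mutually recursive functions whose recursive calls are in tail
   position. With one integer argument each, nothing should need to be
   passed on the stack. *)
let rec is_even n = if n = 0 then true else is_odd (n - 1)
and is_odd n = if n = 0 then false else is_even (n - 1)

(* Ten million hops between the two functions. *)
let big_is_even = is_even 10_000_000
```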

Note that calling conventions still depend strongly on the target architecture. On x86, for example, a small number of globals are used when the registers are exhausted and before spilling on the stack, to preserve tail calls.

I also agree on "leftmost-first-in": arguments are traversed in order by calling_conventions in proc.ml and stored in offset order by slot_offset in emit.mlp; they were computed right-to-left, but returned in order, in selectgen.ml.

score 4 · Answer 2 · answered Jul 04 '12 at 15:34

Yes, you cannot recover the arguments of a call: OCaml tries to reuse registers as much as possible, and will thus destroy their contents once they are no longer needed in the remainder of the function. Debuggers have no way to print the arguments. At a given point in the function they can only print the variables that are still live, and even for that you would need to modify ocamlopt to emit DWARF information from which the values can be recovered.
