In the majority of cases, a compiled procedure is a sequence of processor instructions that occupies a contiguous range of bytes in the code section. It may of course contain conditional and unconditional jumps and non-linear execution flow, but looking at a disassembly listing you can say definitively where the procedure begins (i.e. its entry point) and where it ends.
However, sometimes CL splits procedures into parts and interleaves these parts, so that you can find proc_b between the first half of proc_a and the second half of proc_a.
The question is: what command-line switch makes the compiler generate code as described below?
I was analyzing an executable in my debugger/disassembler and noticed a large number of functions with fragmented bodies. I have the binary itself and the debug symbols for it, and I know it was compiled using CL, but I don't have the sources or the makefile, so I have no idea what command-line options were used to compile it.
Let me show you a little example (it's just demonstration code, not from real life).
Say you have the following function written in C++ (boo is a class, methods are virtual):
int foo(boo *x)
{
    if (x->ready == 0)
    {
        return 0;
    }
    else
    {
        x->func_a(x->bzz);
        x->func_b(x->kee);
        return x->func_c();
    }
}
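To make the example self-contained, here is a minimal sketch of what boo could look like. Only the member names (ready, bzz, kee, func_a, func_b, func_c) come from foo above. The member types, the extra sum field and the method bodies are assumptions made purely so that the snippet compiles and does something observable. The declaration order is chosen so that, on a typical 32-bit target, the vtable pointer comes first and the data members follow it, but exact offsets depend on the compiler and target.

```cpp
// Hypothetical definition of boo: only the member names come from foo.
class boo {
public:
    int ready = 0;  // flag tested by foo
    int bzz = 0;    // passed to func_a
    int kee = 0;    // passed to func_b
    int sum = 0;    // assumption: state that makes the calls observable

    // Declared in this order so that they occupy consecutive vtable slots.
    virtual void func_a(int v) { sum += v; }
    virtual void func_b(int v) { sum += v; }
    virtual int func_c() { return sum; }
};

int foo(boo *x)
{
    if (x->ready == 0)
    {
        return 0;  // the branch that ends up in the separate foo_X chunk
    }
    else
    {
        x->func_a(x->bzz);
        x->func_b(x->kee);
        return x->func_c();
    }
}
```

With this definition, foo returns 0 while ready is 0, and otherwise returns whatever func_c reports after func_a and func_b have run.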
Now, when run with some mysterious command-line option(s), CL decides to take a small group of instructions (representing the return 0; branch of the condition) and move it towards the end of the code section, far away from the main part of the foo procedure. Moreover, this small group of instructions gets its own entry in the debug symbols table. Its name is composed of the name of the procedure being split, an underscore character and a decimal number (e.g. foo_13). In most cases, but not all, that number is the offset, relative to the procedure entry point, of the jump-destination operand (which is itself an offset) inside the jump instruction.
So, CL compiles as follows:
foo:
    push ebp
    mov ebp, esp
    push edi
    mov edi, [ebp+8]
    cmp [edi+4], 0
    je foo_X            ; <----- jump down below to the isolated (!) piece of 'foo'
    push esi
    mov esi, [edi]
    mov ecx, edi
    push [edi+8]
    call [esi]
    push [edi+12d]
    mov ecx, edi
    call [esi+4]
    mov ecx, edi
    call [esi+8]
    pop esi
    pop edi             ; <----- the jump back from the small piece 'foo_X' lands here
    pop ebp
    retn 4
OtherFunc1:
    <code for other function>
    <code for other function>
    <code for other function>
OtherFunc2:
    <code for other function>
    <code for other function>
    <code for other function>
<lots of code not related to 'foo' at all>
foo_X:
    xor eax, eax
    jmp <address of 'pop edi' within main part of 'foo'>
foo_X (X stands for some decimal number, as described above) is a small two-instruction chunk representing the true branch of the if statement.
In my case there is a whole bunch of such tiny chunks. Most of them (but not all) consist of two instructions: either zeroing some register with xor reg, reg and jumping back to the main part of the function, or zeroing EAX and performing RETN. They all have their own names in the debug symbols table. If we have functions like foo, bar and baaz, there are also foo_7, bar_22, bar_43 and baaz_19. Most of these tiny chunks are grouped together and lie close to each other in the code section, but far away from their counterparts (foo, bar, baaz), so the jumps to these chunks go all over the code section.
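As an illustration of the naming scheme, here is a rough sketch of how one could pair such chunk symbols with their parent functions when going through a symbol listing. The helper names split_chunk_name and group_chunks are hypothetical. The sketch assumes that the symbol names are already available as plain strings and that the compiler supports C++17. It only matches the name_number pattern and does not try to interpret the number.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <optional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Splits "foo_13" into {"foo", 13}. Returns std::nullopt when the name does
// not end in '_' followed by one or more decimal digits.
std::optional<std::pair<std::string, unsigned long>>
split_chunk_name(const std::string& name)
{
    const std::size_t us = name.rfind('_');
    if (us == std::string::npos || us == 0 || us + 1 == name.size())
        return std::nullopt;
    for (std::size_t i = us + 1; i < name.size(); ++i)
        if (!std::isdigit(static_cast<unsigned char>(name[i])))
            return std::nullopt;
    return std::make_pair(name.substr(0, us), std::stoul(name.substr(us + 1)));
}

// Maps each parent function to the numeric suffixes of its chunks. A symbol
// counts as a chunk only if its prefix is itself a known symbol, so a name
// such as "qux_5" is ignored unless "qux" also exists.
std::map<std::string, std::vector<unsigned long>>
group_chunks(const std::vector<std::string>& symbols)
{
    const std::set<std::string> known(symbols.begin(), symbols.end());
    std::map<std::string, std::vector<unsigned long>> chunks;
    for (const std::string& s : symbols) {
        const auto parts = split_chunk_name(s);
        if (parts && known.count(parts->first))
            chunks[parts->first].push_back(parts->second);
    }
    return chunks;
}
```

Fed the names foo, bar, baaz, foo_7, bar_22, bar_43 and baaz_19, group_chunks would associate 7 with foo, 22 and 43 with bar, and 19 with baaz. An ordinary function whose name happens to end in an underscore and digits would be misattributed if its prefix is also a symbol, so treat the result as a heuristic.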
It may be related to an optimization based on branch prediction: the compiler moves an execution branch that it considers unlikely to be taken away from the main execution path. However, I observe these tricks in a binary compiled in 1998, so CL.EXE from MSVC6 (or even MSVC5!) was obviously used, and there's no way to give a branch-prediction hint to the optimizer of those old versions of CL. Yes, modern versions of CL support profile-guided optimization, but we are talking about code compiled in 1998.
One option that could be what I am looking for is /Gy, the option for so-called function-level linking. This option instructs the compiler to wrap each individual function in a separate COMDAT, so that at link time functions can be reordered in any desired order and some of them can be excluded. However, as far as I can see, it wraps an entire function in a single COMDAT, whereas in my case the compiler would need to place separate fragments of a single function in separate COMDATs in order to (1) allow the linker to place fragments of a function far away from each other and (2) allow such small fragments to have their own names in the debug symbols table.
Once again, my question is: what command-line options/switches for CL/LINK control this behavior?