GCC can use multi-byte NOPs to align loops and functions. However, when I tried the -fpatchable-function-entry
option, it always emitted single-byte NOPs.
In this example you can see that GCC aligns the function with nop DWORD PTR [rax+rax*1+0x0]
and nop WORD PTR cs:[rax+rax*1+0x0],
but uses eight 0x90 NOPs at the function entry when I specify -fpatchable-function-entry=8,3.
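For reference, a minimal reproduction might look like the sketch below. The file name repro.c and the function names are placeholders I made up for illustration, not the code from the example above. The second function uses the patchable_function_entry attribute, which is the per-function form of the same option.

```c
/* repro.c (hypothetical file name)
 * Build, for example, with:
 *   gcc -O2 -fpatchable-function-entry=8,3 -S -masm=intel repro.c
 * and look at the instructions around each function label.
 */

/* Padded according to the command-line option. */
int square(int x)
{
    return x * x;
}

/* Same request made per function through the attribute form. */
__attribute__((patchable_function_entry(8, 3)))
int cube(int x)
{
    return x * x * x;
}
```

With 8,3 I would expect three of the NOPs to sit before the function label and five after it, each one a separate 0x90 byte as described above.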
I found this in the documentation:
-fpatchable-function-entry=N[,M]
- Generate N NOPs right at the beginning of each function, with the function entry point before the Mth NOP. If M is omitted, it defaults to 0 so the function entry points to the address just at the first NOP. The NOP instructions reserve extra space which can be used to patch in any desired instrumentation at run time, provided that the code segment is writable. The amount of space is controllable indirectly via the number of NOPs; the NOP instruction used corresponds to the instruction emitted by the internal GCC back-end interface gen_nop. This behavior is target-specific and may also depend on the architecture variant and/or other compilation options.
The documentation clearly says that N NOPs will be inserted. However, I think this should instead be a single N-byte NOP (or whatever number of NOPs optimally fills the N-byte space). Similarly, if M is specified, GCC should emit an M-byte NOP followed by an (N − M)-byte NOP.
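To make concrete what I mean, here is a rough sketch of the padding I would expect on x86. It is not what GCC does; fill_nops and fill_patchable_entry are hypothetical helpers I wrote for illustration, and the byte sequences are the multi-byte NOP encodings recommended in the Intel manual (1 to 9 bytes).

```c
#include <stddef.h>
#include <string.h>

/* Intel-recommended NOP encodings; row i holds the (i + 1)-byte NOP. */
static const unsigned char nop_table[9][9] = {
    {0x90},
    {0x66, 0x90},
    {0x0F, 0x1F, 0x00},
    {0x0F, 0x1F, 0x40, 0x00},
    {0x0F, 0x1F, 0x44, 0x00, 0x00},
    {0x66, 0x0F, 0x1F, 0x44, 0x00, 0x00},
    {0x0F, 0x1F, 0x80, 0x00, 0x00, 0x00, 0x00},
    {0x0F, 0x1F, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00},
    {0x66, 0x0F, 0x1F, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00},
};

/* Hypothetical helper: fill buf[0..len) using as few NOPs as possible.
 * Returns the number of NOP instructions written. */
size_t fill_nops(unsigned char *buf, size_t len)
{
    size_t count = 0;
    while (len > 0) {
        size_t chunk = len > 9 ? 9 : len;  /* longest NOP that still fits */
        memcpy(buf, nop_table[chunk - 1], chunk);
        buf += chunk;
        len -= chunk;
        count++;
    }
    return count;
}

/* Hypothetical helper: pad n bytes so that the function entry point falls
 * after the first m bytes. Assumes m <= n. No NOP straddles the entry. */
size_t fill_patchable_entry(unsigned char *buf, size_t n, size_t m)
{
    return fill_nops(buf, m) + fill_nops(buf + m, n - m);
}
```

For -fpatchable-function-entry=8,3 this would produce a 3-byte NOP before the entry point and a 5-byte NOP after it, so two instructions instead of eight.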
So why does GCC do this? Can we make it generate multi-byte NOPs? And are two 0x90 NOPs better than Microsoft's mov edi, edi?