I've been going through some Assembly Programming Videos to get a better understanding of how to manually optimize the *.s files left after compiling with gcc/g++ -S ...
One of the topics covered was Refactoring Redundant Code, which demonstrates how to move redundant code to its own labeled block ending with a ret and replace each original occurrence with a call.
The example given in the video is 2 blocks containing:
mov eax,power
mul ebx
mov power,eax
inc count
which it replaces with call CalculateNextPower
and CalculateNextPower looks like:
CalculateNextPower:
mov eax,power
mul ebx
mov power,eax
inc count
ret
Out of curiosity, and to try to reduce compiled size, I compiled some C and C++ projects with -S and various options, including -Os, -O2, -O3, -pipe, -combine and -fwhole-program, and analyzed the resulting *.s files for redundancy using a lightly patched (for .s files) version of duplo. Only -fwhole-program (now deprecated IIRC) had a significant effect on eliminating duplicate code across files, but it still misses significantly large blocks of code. I assume its replacement, -flto, would behave similarly at link time, roughly equivalent to compiling with -ffunction-sections -fdata-sections and linking with --gc-sections.
Manual optimization using the duplo output resulted in a ~10% size reduction in a random C project and almost 30% in a random C++ project, even though I only deduplicated blocks of at least 5 contiguous duplicate instructions.
Am I missing a compiler option (or even a standalone tool) that eliminates redundant assembly automatically when compiling for size (including with other compilers: clang, icc, etc.), or is this functionality absent (and if so, is there a reason)?
If it doesn't exist, it could be possible to modify duplo to ignore lines starting with a '.' or ';' (and others?) and replace duplicate code blocks with calls to functions containing the duplicate code, but I am open to other suggestions that would work directly with the compiler's internal representations (preferably clang or gcc).
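To make the detection idea concrete, here is a rough Python sketch of what I mean. It is not duplo's actual algorithm, and the names instruction_lines and duplicate_windows are made up for illustration. It skips blank lines and lines starting with '.', ';' or '#', then reports every run of min_len consecutive instructions that occurs more than once.

```python
from collections import defaultdict


def instruction_lines(text):
    """Return (line_number, normalized_text) pairs for lines that are kept.

    Blank lines and lines starting with '.', ';' or '#' are skipped.
    Assumption: this also skips GCC local labels such as '.L2:', which is
    unsafe for real refactoring because a label is a possible jump target.
    """
    kept = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = " ".join(raw.split())  # collapse tabs and repeated spaces
        if not line or line[0] in ".;#":
            continue
        kept.append((number, line))
    return kept


def duplicate_windows(text, min_len=5):
    """Map each run of min_len consecutive kept lines to its start lines.

    Only runs that occur at more than one position are returned.
    Longer duplicates show up as several overlapping windows.
    """
    instrs = instruction_lines(text)
    seen = defaultdict(list)
    for i in range(len(instrs) - min_len + 1):
        window = tuple(ins for _, ins in instrs[i:i + min_len])
        seen[window].append(instrs[i][0])
    return {w: starts for w, starts in seen.items() if len(starts) > 1}
```

Calling duplicate_windows(open("foo.s").read()) on a hypothetical foo.s would give a dict keyed by the duplicated instruction tuple, with the source line numbers where each copy starts. Merging overlapping windows into maximal blocks is left out.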
Edit: I patched duplo to identify blocks of duplicate assembly here, but it still requires manual refactoring at the moment. As long as the same compiler is used to generate the code, it may be possible (but probably slow) to identify the largest blocks of duplicate code, put them in their own "function" block and replace the code with a CALL to that block.
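The replacement step could look roughly like the following Python sketch. The function name outline_block is hypothetical, and it works purely on text. It assumes the block contains no labels, no jumps out of the block, and no stack-pointer-relative operands (the call pushes a return address, which would shift them), and it checks none of that.

```python
def outline_block(lines, block, label):
    """Replace each non-overlapping copy of `block` with 'call label'.

    The outlined body is appended after the last line as 'label:', the
    block, then 'ret'. Assumption: the last original line never falls
    through (it is a ret or an unconditional jump), otherwise execution
    would run into the new body. If fewer than two copies are found the
    input is returned unchanged, since outlining would only add code.
    """
    body = [b.strip() for b in block]
    n = len(body)
    out, i, hits = [], 0, 0
    while i < len(lines):
        if [l.strip() for l in lines[i:i + n]] == body:
            out.append("call " + label)
            i += n
            hits += 1
        else:
            out.append(lines[i])
            i += 1
    if hits < 2:
        return list(lines)
    out.append(label + ":")
    out.extend(body)
    out.append("ret")
    return out
```

With the block from the video, two 4-instruction copies become two calls plus one 5-line body, so the saving only becomes worthwhile for longer blocks or more copies, and each call/ret pair adds some runtime overhead.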