How does the size of a binary influence its execution speed? Specifically, I am talking about code written in ANSI C and translated into machine language with the GNU or Intel compilers. The target platforms are modern computers with Intel or AMD multi-core CPUs running a Linux operating system. The code performs numerical computations, possibly in parallel using OpenMP, and the binary can be several megabytes in size.
Note that the execution time will in any case be much larger than the time needed to load the code and its libraries. I am thinking of specific codes used to solve large systems of ordinary differential equations arising in simulations of kinetic equations; these are typically CPU-bound for moderate system sizes but can also become memory-bound.
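To make the setting concrete, here is a minimal sketch (with hypothetical names) of the kind of kernel I mean: an OpenMP-parallel right-hand-side evaluation for a large ODE system. For small `n` the loop is CPU-bound; once `y` and `dydt` no longer fit in the caches it becomes memory-bound.

```c
#include <stddef.h>

/* Right-hand side of a large ODE system with simple nearest-neighbour
   coupling, evaluated in parallel with OpenMP. */
void rhs(size_t n, const double *y, double *dydt, double k)
{
    long i;
    #pragma omp parallel for
    for (i = 1; i < (long)n - 1; ++i) {
        dydt[i] = k * (y[i - 1] - 2.0 * y[i] + y[i + 1]);
    }
}
```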
I am asking whether small binary size should be a design criterion for highly efficient code, or whether I can always give preference to explicit code (which may repeat code blocks that could instead be factored into functions) and to compiler optimizations such as loop unrolling. A sketch of the trade-off I have in mind follows below.
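As a hypothetical illustration of that trade-off (names and the unroll factor are my own, not from any particular code): the same update written once as a small helper versus written out explicitly. The explicit variant duplicates code and enlarges the binary, but hands the compiler a straight-line body to unroll and vectorize.

```c
#include <stddef.h>

/* Variant A: compact, factored into a helper function. */
static double decay(double y, double k) { return -k * y; }

void step_compact(size_t n, double *y, const double *k, double dt)
{
    size_t i;
    for (i = 0; i < n; ++i)
        y[i] += dt * decay(y[i], k[i]);
}

/* Variant B: explicit, manually unrolled by 4.
   Remainder loop omitted for brevity; assumes n is a multiple of 4. */
void step_explicit(size_t n, double *y, const double *k, double dt)
{
    size_t i;
    for (i = 0; i < n; i += 4) {
        y[i]     += dt * (-k[i]     * y[i]);
        y[i + 1] += dt * (-k[i + 1] * y[i + 1]);
        y[i + 2] += dt * (-k[i + 2] * y[i + 2]);
        y[i + 3] += dt * (-k[i + 3] * y[i + 3]);
    }
}
```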
I am aware of profiling techniques and how to apply them to specific problems, but I wonder to what extent general statements can be made.