Say I have a function that is called from within a tight loop. In one scenario it allocates a few large POD arrays (no constructors) on the stack on every call; in the other, I allocate the arrays dynamically once and reuse them in each iteration. Do the local arrays add any run-time cost?
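To make the two scenarios concrete, here is a minimal sketch in C++. The function names, the array size kN, and the sum-of-squares workload are all hypothetical and only there for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kN = 4096;  // hypothetical scratch size (32 KB of doubles)

// Scenario 1: the scratch array is a local, so stack space is reserved on every call.
double sum_squares_stack(const double* in, std::size_t n) {
    assert(n <= kN);
    double scratch[kN];  // uninitialized POD array, no constructor runs
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        scratch[i] = in[i] * in[i];
        total += scratch[i];
    }
    return total;
}

// Scenario 2: the caller allocates the scratch buffer once and passes it in.
double sum_squares_reuse(const double* in, std::size_t n, double* scratch) {
    assert(n <= kN && scratch != nullptr);
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        scratch[i] = in[i] * in[i];
        total += scratch[i];
    }
    return total;
}

// The "tight loop": both variants are called once per iteration.
bool scenarios_agree(int iterations) {
    const double in[4] = {1.0, 2.0, 3.0, 4.0};
    std::vector<double> scratch(kN);  // single dynamic allocation, outside the loop
    for (int it = 0; it < iterations; ++it) {
        if (sum_squares_stack(in, 4) != sum_squares_reuse(in, 4, scratch.data())) {
            return false;
        }
    }
    return true;
}
```

Both functions do the same work; the only difference is where the scratch memory comes from, which is the part I'm asking about.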
As I understand it, allocating local POD variables comes down to adjusting the stack pointer, so it shouldn't matter much. However, a few things come to mind that could affect performance:
Checking for stack overflow: who performs these checks, when, and how often? On some systems the stack can grow automatically, but I know very little about this.
Cache considerations: is the stack treated in a special way by the CPU cache, or is it no different from any other data?
Are variable-length arrays any different with respect to the above? Say, for constant-sized arrays the stack space can somehow be preallocated (or precomputed by the compiler?), whereas for variable-length ones something else is involved that adds run-time cost. Again, I have no idea how this works.