
Suppose there is code as shown below:

void func1()     // first way
{
    CRITICALSECTIONTYPE CS;
    ENTERCRITICALSECTION(CS);
    int x = getValue();
    LEAVECRITICALSECTION(CS);
}
void func2()    // second way
{
    int x;
    CRITICALSECTIONTYPE CS;
    ENTERCRITICALSECTION(CS);
    x = getValue();
    LEAVECRITICALSECTION(CS);
}

Is there any (even the slightest) possibility of performance overhead in the first way compared to the second? Anything specific to how compilers optimize it? An answer with assembly code is preferred. Thanks :)

Abhinav
  • You know that you can get the compiler to output assembly code listings for yourself, right? `/FAs` on MSVC, `-S` on GCC. Saves asking a question each time, this doesn't scale very well. – Cody Gray - on strike Aug 24 '14 at 17:18
  • @Codegray, that trick works only if you know how to read assembly code. – R Sahu Aug 24 '14 at 18:08
  • There shouldn't be any difference. You should be able to test that out in a simple program. – R Sahu Aug 24 '14 at 18:11

1 Answer


Of course, the answer might be compiler-dependent.

However, I compiled a program with a loop and a block variable created in the critical section, then recompiled it with the variable created outside the loop.

  • The assembly code generated (MSVC 2013, debug build without optimization) is exactly the same for an uninitialized variable. In fact, the compiler reserves the required stack space at function entry, so that nothing needs to be done when entering the critical section.
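To repeat the experiment yourself, a minimal sketch like the one below can be compiled with `g++ -S` (or `cl /FAs` on MSVC) in both variants and the listings diffed. Note that `std::mutex` and `getValue()` here are stand-ins for the placeholder macros in the question, chosen only so the snippet compiles on its own:

```cpp
#include <mutex>

// Stand-ins for the CRITICALSECTIONTYPE macros and getValue()
// helper from the question, so the sketch is self-contained.
static std::mutex cs;
static int getValue() { return 42; }

int func1() {                    // variable declared inside the locked region
    std::lock_guard<std::mutex> lock(cs);
    int x = getValue();
    return x;
}

int func2() {                    // variable declared before locking
    int x;
    {
        std::lock_guard<std::mutex> lock(cs);
        x = getValue();
    }
    return x;
}
```

In both variants the stack slot for `x` is typically carved out once in the function prologue, before any locking happens, which is why the two listings tend to match.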

I experimented a little bit with some variations on your question:

  • With an initialized variable, the compiler generates the additional initialization instructions where you put the declaration in the code, potentially inside the critical section.

  • With an uninitialized array of automatic storage duration (example: char y[n];), the principle is the same: no additional instruction ends up in the critical section. Why? Because the standard accepts such arrays only if the size (here n) is a compile-time constant. So again, at code-generation time, the compiler knows how much space needs to be allocated on the stack at function entry.

  • With more complex objects, if a constructor needs to be called, then the corresponding instructions would necessarily be performed in the critical section.
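The constructor case can be sketched as follows (again with `std::mutex` standing in for the critical-section macros, and hypothetical function names):

```cpp
#include <mutex>
#include <string>

static std::mutex cs;

// The std::string constructor is a real function call (it may even
// allocate), so declaring s inside the locked scope runs that
// constructor while cs is held.
std::size_t lockedLength(const char* src) {
    std::lock_guard<std::mutex> lock(cs);
    std::string s(src);          // constructor executes inside the section
    return s.size();
}

// Constructing before locking keeps the constructor's work
// out of the critical section.
std::size_t preparedLength(const char* src) {
    std::string s(src);          // constructor executes before locking
    std::lock_guard<std::mutex> lock(cs);
    return s.size();
}
```

Both functions compute the same result; only the amount of work done while the lock is held differs.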

In any case, keep in mind that even when you add code in the critical section, the optimizer could still find ways to optimize it (e.g. constant propagation, detecting loop invariants, etc.).
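As an illustration (hypothetical names), here is a sketch of the kind of code an optimizer can rework even when it sits inside a locked section:

```cpp
// k is a compile-time constant, so constant propagation folds it away,
// and base * k is loop-invariant, so a compiler may hoist it out of
// the loop entirely rather than recomputing it each iteration.
int sumScaled(const int* data, int n, int base) {
    int total = 0;
    for (int i = 0; i < n; ++i) {
        const int k = 3;              // folded at compile time
        total += data[i] * base * k;  // base * k can be hoisted
    }
    return total;
}
```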

Edit

At your request, here is an extract of the ASM code for the first case. Sorry for the big screenshot, but it was the only means to show the code comparison easily. The differences are highlighted in yellow and gray.

You'll notice that the differences are only the comments corresponding to the C++ source and the lines where b is used (solely because the stack offset is named _b$1 for the block variable and _b$ for the function variable). (1) stack offset used to access the variables (2) entry point of the function (3) example of local-variable initialization (4) variable in the critical section (on the left the variable is created inside the section, on the right outside it).

[screenshot: side-by-side comparison of the two MSVC assembly listings]

Christophe
  • Looking at unoptimized code is essentially worthless. You don't care about the performance of debug builds. – Cody Gray - on strike Aug 25 '14 at 08:46
  • can you add the assembly code for optimized and unoptimized versions as well? – Abhinav Aug 25 '14 at 10:28
  • @Abhinav here the unoptimized code to show you how it works. But best would be that you try to generate the files, as explained by Cody Gray. Even if you're not familiar with assembler, with a diff tool like winmerge, it's easy to find the differences. – Christophe Aug 25 '14 at 19:09
  • @CodyGray yes, you're fully right! I did it here to show that even in unoptimized code there's no performance overhead, meaning that in optimized version it could be only better. – Christophe Aug 25 '14 at 19:12
  • 1
    Not sure about that "in optimized version it could be only better". Sometimes optimizations push variable initializations down closer to the point where the variable is actually used, so in some cases the initialization is never even performed (e.g. if the function returns before the variable is even used). That wouldn't be the case in this example since the variable isn't initialized at the point it's created, but in another example it might be. In that case, the optimized version might be worse in this regard... the initialization might be done in the critical section. – phonetagger Aug 25 '14 at 19:36
  • Optimized code can only be "better" if by "better" you mean "more optimized". But that also tends to imply "different from what one would naively expect". Debug builds generally write the assembly code that I would write—I'm often surprised by what an optimizing compiler does with Release builds. I'll grant it can sometimes be useful to look at Debug source to understand the *semantics* of something (although most people grasp C++ code more readily than assembly), but it is a poor way of assessing performance impacts. The limited registers on x86-32 is a big constraint, like phonetagger said. – Cody Gray - on strike Aug 26 '14 at 03:45
  • Sorry if my original comment sounded gruff, though. I was typing on my phone. I was just trying to encourage you to provide the optimized assembly in your answer, since it seems to me more germane to the question of performance. And because I see lots of well-meaning answers here that discuss Debug builds, where performance and code-generation is, in my opinion, quite irrelevant. – Cody Gray - on strike Aug 26 '14 at 03:47
  • @CodyGray Yes, the question is related to performance, and optimizations might vary from compiler to compiler. I will try it out myself shortly. Thanks Christophe and other guys for your time – Abhinav Aug 26 '14 at 12:19