
From Optimizing Software in C++ (Section 7.1),

The advantage of static data is that it can be initialized to desired values before the program starts. The disadvantage is that the memory space is occupied throughout the whole program execution, even if the variable is only used in a small part of the program. This makes data caching less efficient.

The usage of static in this excerpt applies to both C and C++, specifically in the sense of static storage duration.

Can anyone shed some light on why (or whether) data caching is less efficient for static duration variables? Here is a specific comparison:

void foo() {
  static int static_arr[] = {/**/};
}
void bar() {
  int local_arr[] = {/**/};
}

I don't see any reason why static data would cache differently than any other kind of data. In the given example, I would think that foo will be faster because the execution stack doesn't have to load static_arr, whereas in bar, the execution stack has to load local_arr. In either case, if these functions were called repeatedly, both static_arr and local_arr will be cached. Am I wrong?

rustyx
okovko
  • @Someprogrammerdude No. This issue is equally applicable to C and C++. Both tags belong. If you are confused about the usage of static in this case, then ask for further details. – okovko Feb 18 '19 at 08:38
  • Static data will cache exactly the same as any other data. I would say that the author of the book in question does not really know what he is talking about. – Johan Feb 18 '19 at 08:47
  • Does the source provide an MCVE illustrating that statement? If it does not, then you can ignore this statement. I think that they probably imply that static data is typically stored "far" from stack variables, however it does not necessarily mean that use of static variables automatically makes data caching less efficient. – user7860670 Feb 18 '19 at 08:49
  • @Johan I suspect he meant to say something else, but yeah, as it stands, the statement is nonsense. Maybe I'll email him. He does keep this book up to date. – okovko Feb 18 '19 at 08:49
  • @VTT Unfortunately no. You can take a look at Section 7.1 pages 26 and 27 for all the explanatory material provided. Okay, I will carry on reading without thinking hard about that one then. Perhaps it will be clear what he meant later in the text anyways. – okovko Feb 18 '19 at 08:51
  • I think the important part is this: "The disadvantage is that the memory space is occupied throughout the whole program execution, even if the variable is **only used in a small part of the program**." It looks like the author was making the case for an example where the function was only getting little use so the variable would only "exist" in a little bit of the program's total running time, as opposed to having it be global and existing in memory all the time. Which seems like a valid point - it really should be worded more clearly though. – Blaze Feb 18 '19 at 09:15

3 Answers


In general, Agner Fog usually knows what he is talking about.

If we read the quote in the context of section 7.1 Different kinds of variable storage, we see what he means by "less efficient caching" in the beginning of the section:

Data caching is poor if data are scattered randomly around in the memory. It is therefore important to understand how variables are stored. The storage principles are the same for simple variables, arrays and objects.

So the idea behind saying that static variables are less cache-efficient is that the chance that the memory location where they are stored is "cold" (no longer in cache) is greater than with stack memory, which is where the variable with automatic storage duration would be stored.

With caching and paging in mind, it's the combination of spatial and temporal locality of data storage that affects performance.

rustyx
  • Feels awkward to apply the concept of data locality to static data. I mean, am I wrong to think of BSS memory as always cached? It feels like a misapplication of concepts. Stack and heap are lazy mapped to pages, but BSS is a set size and always loaded. This requires further thinking. – okovko Feb 18 '19 at 09:41
  • I mean, the compiler knows exactly when which parts of BSS will be needed. So surely it will indicate to the CPU to cache it. Does it not work like that? I am naive. – okovko Feb 18 '19 at 09:50
  • At processor level there is no "static data", there are just memory pages and cache lines. The C++ compiler will reorder memory accesses to start fetching data as early as possible, but the problem with [static local variables](https://en.cppreference.com/w/cpp/language/storage_duration#Static_local_variables) is that the current C++ standard requires that they are atomically initialized the first time the block is executed. That makes them extra slow due to the locking overhead and the inability to hoist the memory access out of the block. – rustyx Feb 18 '19 at 10:14
  • Compilers usually stay out of the caching decision-making, since such decisions require static as well as dynamic knowledge (how the program actually behaves when it runs), the actual target CPU platform (there are differences between vendors) and the context in which the application runs (the cache is shared with other apps and OS). – rustyx Feb 18 '19 at 10:24
  • By the way, the [`.bss` section](https://en.wikipedia.org/wiki/.bss) is for zero-initialized global storage. A global array that is not zero-initialized will be stored in the `.data` section (or `.rdata` if it is `const`). In any case yes the memory pages will be pre-allocated but whether or not they will be in *cache* depends on the application's total memory usage patterns (the [active working set](https://en.wikipedia.org/wiki/Working_set)). – rustyx Feb 18 '19 at 10:39
  • Okay, thank you very much for clearing up my misconceptions for me. I'm much clearer on what's going on now! – okovko Feb 18 '19 at 11:27
  • Still find that very poorly phrased, as heap storage would also have a similar issue, and I'm not even talking about ASLR that makes the comment outdated IMHO. – Matthieu Brucher Feb 18 '19 at 11:28

The answer from rustyx explains it. Local variables are stored on the stack. The stack space is released when a function returns and reused when the next function is called. Caching is more efficient for local variables because the same memory space is reused again and again, while static variables are scattered around at different memory addresses that can never be reused for another purpose. Whether static data are stored in the DATA section (initialized) or the BSS section (uninitialized) makes no difference in this respect. The top-of-stack space is likely to stay cached throughout program execution and be reused many times.

Another advantage is that a limited number of local variables can be accessed with an 8-bit offset relative to the stack pointer, while static variables need a 32-bit absolute address (in 32-bit x86) or a 32-bit relative address (in x86-64). In other words, local variables may make the code more compact and improve utilization of the code cache as well as the data cache.

// Example
void f();  // declared before use so that main() compiles
void g();

int main() {
  f();
  g();
  return 0;
}

void f() {
   int x;
   // ...
}

void g() {
   int y;  // y may occupy the same memory address as x
   // ...
}
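
As a rough illustration of the stack reuse described above, the following sketch records the address of a local and of a static variable in two functions that are called one after the other. The names `Addresses`, `probe_f`, `probe_g` and `report` are hypothetical and not from the manual. Whether the two locals really end up at the same address depends on the compiler, the optimization level and the ABI, so the code only prints the addresses and does not rely on them matching.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical helper type: addresses observed during one call.
struct Addresses {
    std::uintptr_t local_addr;   // address of the automatic variable
    std::uintptr_t static_addr;  // address of the static variable
};

Addresses probe_f() {
    int x = 1;          // automatic storage: lives in this call's stack frame
    static int sx = 1;  // static storage: one fixed address for the whole run
    return { reinterpret_cast<std::uintptr_t>(&x),
             reinterpret_cast<std::uintptr_t>(&sx) };
}

Addresses probe_g() {
    int y = 2;          // may reuse the stack slot that x occupied
    static int sy = 2;  // a distinct object, so it cannot share sx's address
    return { reinterpret_cast<std::uintptr_t>(&y),
             reinterpret_cast<std::uintptr_t>(&sy) };
}

// Prints the observed addresses; the values differ between systems and runs.
void report() {
    Addresses f = probe_f();
    Addresses g = probe_g();
    std::printf("locals:  %#llx %#llx\n",
                static_cast<unsigned long long>(f.local_addr),
                static_cast<unsigned long long>(g.local_addr));
    std::printf("statics: %#llx %#llx\n",
                static_cast<unsigned long long>(f.static_addr),
                static_cast<unsigned long long>(g.static_addr));
}
```

Calling `report()` from `main` on a typical unoptimized build will often show the two locals at the same address, while the two statics always have different addresses.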
A Fog
  • I see! So if `x` and `y` were static, then the top of the stack would not be reused and the program would be using "cold" data. This is an excellent response and I think it's incredible to get the author of the manual in question to give such a clear response to the confusion. Your manual could possibly benefit from the inclusion of this example in Section 7.1. Thank you! – okovko Feb 18 '19 at 11:36

The statement does or does not make sense depending on how you punctuate it:

Reading 1:

The advantage of static data is that it can be initialized to desired values before the program starts. The disadvantage is that the memory space is occupied throughout the whole program execution, even if the variable is only used in a small part of the program.

All of the above makes data caching less efficient.

This is nonsense.

Reading 2:

The advantage of static data is that it can be initialized to desired values before the program starts.

The disadvantage is that the memory space is occupied throughout the whole program execution...

...even if the variable is only used in a small part of the program. There is a case where this could make data caching less efficient.

That case would be where the static variable has been allocated storage either in a page that is not always swapped in, or is on a cache line that is rarely otherwise used. You may incur a cache miss, or theoretically in the worst case a page fault (although frankly, with the amount of physical memory at our disposal these days, if this happens you have bigger problems).

In the specific case you demonstrate, the answer would be, "it depends".

Yes, the initialisation of static_arr is a one-time-only operation and so can be thought of as costless.

Yes, the initialisation of local_arr happens each time the function is called, but it might be that:

  1. this initialisation is trivial, or
  2. the initialisation is elided by the compiler as part of an optimiser pass

In general, unless you have a specific optimisation in mind, it is Better(tm) to write the code that explicitly states the behaviour you want, i.e.:

  • use static variables (variables with static storage duration) when the variable/array's value(s) should survive successive calls to the function.

  • use local variables (strictly, variables with automatic storage duration) when the existing values are meaningless on entry or exit from the function.

You will find that the compiler will, in almost all cases, do the most efficient thing after the optimisation pass(es).
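
The two guidelines above can be sketched as follows. Both functions are hypothetical examples, not taken from the question: `next_id` needs its state to survive between calls, so it uses a static variable, while `sum_of_squares` only needs scratch space for the duration of one call.

```cpp
// Static storage duration: the counter must survive between calls.
int next_id() {
    static int counter = 0;  // initialised once, keeps its value afterwards
    return ++counter;
}

// Automatic storage duration: the scratch array has no meaning outside one call.
int sum_of_squares(int n) {
    int squares[8] = {};     // re-created (and zeroed) on every call
    int limit = n < 8 ? n : 8;
    int total = 0;
    for (int i = 0; i < limit; ++i) {
        squares[i] = (i + 1) * (i + 1);
        total += squares[i];
    }
    return total;
}
```

Here the choice of storage duration follows from the required behaviour, not from any assumption about which one is faster.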

There is a specific case where static initialisation is Better(tm): a buffer that requires dynamic allocation. You may not want to incur the cost of allocation/deallocation on every call. You may want the buffer to grow dynamically when needed, and stay grown on the basis that future operations may well need the memory again.

In this case, the actual state of the variable is the size of its allocated buffer. Thus, state is important on the function's entry and exit, e.g.:

  std::string const& to_string(json const& json_object)
  {
    static thread_local std::string buffer;   // thread-safe, auto-expanding buffer storage
    buffer.clear();                           // does not release memory
    serialise_to_string(buffer, json_object); // may allocate memory occasionally
    return buffer;
  }
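
A self-contained variant of the same idea, as a minimal sketch: `append_value` is a hypothetical stand-in for `serialise_to_string`, and an `int` replaces the `json` type so that the example needs only the standard library.

```cpp
#include <string>

// Hypothetical stand-in for serialise_to_string: appends "value=<n>".
void append_value(std::string& out, int value) {
    out += "value=";
    out += std::to_string(value);
}

std::string const& to_string_buffered(int value) {
    static thread_local std::string buffer;  // one buffer per thread
    buffer.clear();                          // empties the string; implementations typically keep the capacity
    append_value(buffer, value);             // may allocate occasionally
    return buffer;                           // only valid until this thread's next call
}
```

Note the trade-off in this design: every call on a given thread returns a reference to the same buffer, so a caller that needs to keep the result across calls has to copy it.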
Richard Hodges
  • Can you explain what you mean by static duration variables being on a given page? Static variables are never on lazy mapped stack / heap, as far as I know. Should be on BSS, which has a set size and is loaded with the program itself. You should update your answer to be more correct, or perhaps I am mistaken. – okovko Feb 18 '19 at 09:35
  • @okovko the BSS section could be swapped out if the OS's memory resources are under heavy load. This is the reason for my "you have other problems" comment. – Richard Hodges Feb 18 '19 at 11:24