My GCC version is 12.2.0. It has a flag -fconstexpr-cache-depth.

       -fconstexpr-cache-depth=n
           Set the maximum level of nested evaluation depth for C++11 constexpr functions that will
           be cached to n.  This is a heuristic that trades off compilation speed (when the cache
           avoids repeated calculations) against memory consumption (when the cache grows very large
           from highly recursive evaluations).  The default is 8.  Very few users are likely to want
           to adjust it, but if your code does heavy constexpr calculations you might want to
           experiment to find which value works best for you.

I'm curious about when this flag takes effect, so I borrowed a piece of code from this answer. Here is the code.

#include <array>
#include <vector>
#include <algorithm>

constexpr auto primes_num_vector = [] {
  constexpr int N = 1 << 16;
  std::vector<int> ret;
  bool not_prime[N] = {};
  for (int i = 2; i < N; i++) {
    if (!not_prime[i]) {
      ret.push_back(i);
      for (int j = 2 * i; j < N; j += i) not_prime[j] = true;
    }
  }
  return ret;
};

constexpr auto primes = [] {
  std::array<int, primes_num_vector().size()> ret;
  std::ranges::copy(primes_num_vector(), ret.begin());  // comment out this line in the second bench
  return ret;
}();

Since this constexpr function takes a noticeable amount of time to evaluate, I can determine whether the result is cached by measuring the compilation time. Here is the command I used for measuring.

$ hyperfine 'g++ -std=c++20 -Wall -Wextra -pedantic test.cpp -o /tmp/bin/test'

This is the result of the original code:

  Time (mean ± σ):      1.365 s ±  0.030 s    [User: 1.317 s, System: 0.045 s]
  Range (min … max):    1.336 s …  1.429 s    10 runs

This is the result after I commented out the ranges::copy line:

  Time (mean ± σ):     768.4 ms ±  15.4 ms    [User: 729.7 ms, System: 37.6 ms]
  Range (min … max):   752.9 ms … 793.7 ms    10 runs

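The original version takes almost twice as long as the version that calls primes_num_vector() only once, so the result is evidently not cached.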

I suspected that GCC might simply not handle constexpr lambdas, so I changed primes_num_vector to a normal named function, as shown below, and benchmarked it again.

constexpr auto primes_num_vector() {
  // same as before
}

It turned out that the benchmark results stayed the same.

In what cases can GCC cache a constexpr function like this one?

QuarticCat
    "Set the maximum level of **nested** evaluation depth". So I would say it is for functions such as Fibonacci (with a naive **recursive** implementation). – Jarod42 Aug 26 '22 at 07:25
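
For illustration, the kind of function the comment describes might look like the sketch below. This is only a guess at a suitable test case: the name fib and the argument 20 are hypothetical choices, and whether -fconstexpr-cache-depth measurably changes the compile time of this code has not been verified here.

```cpp
// Naive recursive Fibonacci (hypothetical test case, not from the question).
// Each call makes two nested calls with smaller constant arguments, so the
// same sub-results (fib(18), fib(17), ...) are requested many times at
// increasing nesting depth during constant evaluation.
constexpr unsigned long long fib(unsigned n) {
  return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

// static_assert forces the call to be evaluated at compile time.
static_assert(fib(20) == 6765, "fib(20) should be 6765");
```

Assuming the snippet is saved as fib.cpp (a hypothetical file name), it can be checked with `g++ -std=c++20 -fsyntax-only fib.cpp`, and the same hyperfine approach as above could be repeated with different -fconstexpr-cache-depth values to see whether the timing changes.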
