constexpr depth limit with clang (-fconstexpr-depth doesn't seem to work)

Question

Is there any way to configure the constexpr instantiation depth? I am compiling with -fconstexpr-depth=4096 (using clang/Xcode).

But the code below still fails to compile with the error: Constexpr variable fib_1 must be initialized by a constant expression. It fails whether or not -fconstexpr-depth=4096 is set.

Is this a bug in clang, or is it expected behavior? Note: this works up to fib_cxpr(26); it starts to fail at 27.

Code:

constexpr int fib_cxpr(int idx) {
    return idx == 0 ? 0 :
           idx == 1 ? 1 :
           fib_cxpr(idx-1) + fib_cxpr(idx-2); 
}

int main() {
    constexpr auto fib_1 = fib_cxpr(27);
    return 0; 
}

@Brian: thanks. Somehow I wasn't able to format it correctly. — Sarang, Jul 05 '14 at 23:47
On the other hand, Clang is perfectly happy with a traditional template-style metaprogram. — Kerrek SB, Jul 05 '14 at 23:51
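
For illustration, a template-style metaprogram of the kind the comment above refers to might look like the following minimal sketch. The struct name Fib is hypothetical and not taken from the thread. Each Fib<N> is instantiated only once, so the compiler in effect caches the intermediate results.

```cpp
// Hypothetical template-style Fibonacci (the name Fib is an assumption for this sketch).
// Each specialization Fib<N> is instantiated at most once, so the work is linear in N.
template <int N>
struct Fib {
    static constexpr int value = Fib<N - 1>::value + Fib<N - 2>::value;
};

// Base cases terminate the recursion.
template <>
struct Fib<0> {
    static constexpr int value = 0;
};

template <>
struct Fib<1> {
    static constexpr int value = 1;
};

// Checked at compile time; no constexpr function evaluation is involved.
static_assert(Fib<27>::value == 196418, "fib(27) should be 196418");
```

Because the result comes from template instantiation rather than constexpr function calls, the step limit discussed in the answers below should not come into play here.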

bames53 · Accepted Answer · 2015-03-30T01:11:45.543

TL;DR:

For clang you want the command-line argument -fconstexpr-steps=1271242, and you do not need more than -fconstexpr-depth=27.

The recursive method of calculating Fibonacci numbers does not require very much recursion depth. The depth required for fib(n) is in fact no more than n. This is because the longest chain of calls is through the fib(i-1) recursive call.

constexpr auto fib_1 = fib_cxpr(3); // fails with -fconstexpr-depth=2, works with -fconstexpr-depth=3
constexpr auto fib_1 = fib_cxpr(4); // fails with -fconstexpr-depth=3, works with -fconstexpr-depth=4

So we can conclude that -fconstexpr-depth is not the setting that matters.
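
The depth claim can also be checked at run time with an instrumented copy of the function. This is only a sketch: fib_traced and max_call_depth are hypothetical names, and it assumes that clang's depth counter corresponds to the nesting of calls, with the outermost call counted as 1 (which is consistent with the two results shown above).

```cpp
#include <algorithm>

// Hypothetical instrumented copy of fib_cxpr: same recursion, but it records
// the deepest call nesting reached in max_depth.
int fib_traced(int idx, int depth, int& max_depth) {
    max_depth = std::max(max_depth, depth);
    return idx == 0 ? 0 :
           idx == 1 ? 1 :
           fib_traced(idx - 1, depth + 1, max_depth) +
           fib_traced(idx - 2, depth + 1, max_depth);
}

// Returns the deepest nesting of calls while evaluating fib(idx),
// counting the outermost call as depth 1 (an assumption about how clang counts).
int max_call_depth(int idx) {
    int max_depth = 0;
    fib_traced(idx, 1, max_depth);
    return max_depth;
}
```

Under that assumption, max_call_depth(3) is 3 and max_call_depth(4) is 4, and in general the depth equals n for n >= 1, because the longest chain always follows the idx - 1 branch.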

Furthermore, the error messages also indicate a difference:

constexpr auto fib_1 = fib_cxpr(27);

Compiled with -fconstexpr-depth=26, to be sure we hit that limit, clang produces the message:

note: constexpr evaluation exceeded maximum depth of 26 calls

But compiling with -fconstexpr-depth=27, which is enough depth, produces the message:

note: constexpr evaluation hit maximum step limit; possible infinite loop?

So we know that clang is distinguishing between two failures: recursion depth and 'step limit'.

The top Google results for 'clang maximum step limit' lead to pages about the clang patch implementing this feature, including the implementation of the command-line option: -fconstexpr-steps. Further Googling of this option indicates that there's no user-level documentation.

So there's no documentation about what clang counts as a 'step' or how many 'steps' clang requires for fib(27). We could just set this really high, but I think that's a bad idea. Instead some experimentation shows:

n : steps
0 : 2
1 : 2
2 : 6
3 : 10
4 : 18

Which indicates that steps(fib(n)) == steps(fib(n-1)) + steps(fib(n-2)) + 2. A bit of calculation shows that, according to this, fib(27) should require 1,271,242 of clang's steps. So compiling with -fconstexpr-steps=1271242 should allow the program to compile, which indeed it does. Compiling with -fconstexpr-steps=1271241 results in an error the same as before, so we know we have an exact limit.
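
The "bit of calculation" can be reproduced with a small helper. This is a sketch that only models the recurrence observed in the table above; it says nothing about how clang's evaluator actually counts steps, and the name steps_for_fib is hypothetical. It needs C++14 or later because of the loop inside a constexpr function.

```cpp
// Hypothetical helper modelling the observed recurrence:
//   steps(0) = steps(1) = 2
//   steps(n) = steps(n-1) + steps(n-2) + 2
// Computed iteratively so that the helper itself needs very few steps.
constexpr long long steps_for_fib(int n) {
    long long prev = 2;  // steps(fib(0))
    long long curr = 2;  // steps(fib(1))
    for (int i = 2; i <= n; ++i) {
        long long next = curr + prev + 2;
        prev = curr;
        curr = next;
    }
    return curr;
}

// The measured values from the table.
static_assert(steps_for_fib(2) == 6, "matches the table");
static_assert(steps_for_fib(3) == 10, "matches the table");
static_assert(steps_for_fib(4) == 18, "matches the table");

// The predicted requirement for fib(27).
static_assert(steps_for_fib(27) == 1271242, "predicted steps for fib(27)");
```

The same helper gives 785,670 for fib(26), which is why that case compiles without any extra flags.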

An alternative, less exact method involves observing from the patch that the default step limit is 1,048,576 (2²⁰), which is obviously sufficient for fib(26). Intuitively, doubling that should be plenty, and from the earlier analysis we know that two million is indeed enough. A tighter estimate is φ · steps(fib(26)) ≈ 1,271,241, which lands within a step or two of the exact figure, 1,271,242.

Another thing to note is that these results clearly show that clang is not doing any memoization of constexpr evaluation. GCC does, but it appears that this is not implemented in clang at all. Although memoization increases the memory requirements it can sometimes, as in this case, vastly reduce the time required for evaluation. The two conclusions I draw from this are that writing constexpr code that requires memoization for good compile times is not a good idea for portable code, and that clang could be improved with support for constexpr memoization and a command line option to enable/disable it.

@BenVoigt Yeah, I guess it's a little funny but it makes sense. (and you can see in the compiled example I linked to that I just modified a copy of the original function to calculate the number of steps, instead of working out the closed form or doing anything similarly smart.) — bames53, Jul 06 '14 at 01:14
@bames53: thanks for detailed explanation. I guess I missed expanding the semantic issue reported. XCode gave a very good explanation of how it went wrong.. — Sarang, Jul 06 '14 at 01:50
also seems like step limit was added as part of C++1y extensions: http://llvm.org/klaus/clang/commit/e7565635002ce0daaaf4b714cdb472507af462ee/ — Sarang, Jul 06 '14 at 01:52
@Sarang Yeah, C++14 eliminated many of the restrictions on constexpr functions, so I guess clang added the step limit as a more general limit on resource usage. However `-fconstexpr-steps` is available in C++11 mode as well. — bames53, Jul 06 '14 at 02:01
I looked at the implementation in clang some time ago. The compiler interprets the AST on a constexpr function invocation. The step count is increased each time a node is evaluated. This means that the minimum step count to succeed depends a lot on the function body and on any optimizations applied to the AST. It was also surprising to observe that there is no cache (compared to a recursive template version, which runs in linear time because of the type cache). Anyway, a good Fibonacci implementation does not require recursion :) — galop1n, Jul 06 '14 at 11:27
@galop1n That seems to be true. When I rewrite `fib_cxpr` to use an if-else if-else block instead of the ternary operator, the minimum step count to succeed increases to 3056712. — Saeed Baig, May 27 '20 at 08:44

Chris Philip · Answer 2 · 2021-03-30T23:57:23.240

You can also refactor your Fibonacci algorithm to include explicit memoization, which will work in clang.

// Copyright 2021 Google LLC.
// SPDX-License-Identifier: Apache-2.0

#include <iostream>

template <int idx>
constexpr int fib_cxpr();

// This constexpr template value acts as the explicit memoization for the fib_cxpr function.
template <int i>
constexpr int kFib = fib_cxpr<i>();

// Arguments cannot be used in constexpr contexts (like the if constexpr),
// so idx is refactored as a template value argument instead.
template <int idx>
constexpr int fib_cxpr() {
    if constexpr (idx == 0 || idx == 1) {
        return idx;
    } else {
        return kFib<idx-1> + kFib<idx-2>;
    }      
}

int main() {
    constexpr auto fib_1 = fib_cxpr<27>();
    std::cout << fib_1 << "\n";
    return 0; 
}

This version works for arbitrary inputs to fib_cxpr and compiles with only 4 steps. https://godbolt.org/z/9cvz3hbaE

This doesn't directly answer the question, but I apparently don't have enough reputation to add it as a comment...

A M · Answer 3 · 2021-04-17T12:09:51.367

Unrelated to "depth limit" but strongly related to Fibonacci number calculation.

Recursion may be the wrong approach here, and it is not needed.

There is an ultra-fast solution with a low memory footprint.

We can precompute, at compile time, all Fibonacci numbers that fit into a 64-bit value.

One important property of the Fibonacci series is that the values grow exponentially, so all built-in integer types overflow rather quickly.

With Binet's formula you can calculate that the 93rd Fibonacci number is the last one that fits in a 64-bit unsigned value.
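
The bound of 93 can also be confirmed without Binet's formula, by iterating until the next value would overflow. The following is a minimal sketch; the function name largestFibIndex64 is hypothetical, it uses the convention F(0) = 0, F(1) = 1, and it requires C++14 or later.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: returns the largest n such that F(n) fits in an
// unsigned 64-bit integer, using F(0) = 0 and F(1) = 1.
constexpr unsigned largestFibIndex64() {
    std::uint64_t a = 0;  // F(n)
    std::uint64_t b = 1;  // F(n + 1)
    unsigned n = 0;
    // Keep advancing while F(n + 2) = a + b is still representable.
    while (a <= std::numeric_limits<std::uint64_t>::max() - b) {
        std::uint64_t next = a + b;
        a = b;
        b = next;
        ++n;
    }
    // The loop stops when F(n + 2) would overflow, so F(n + 1) is the last one that fits.
    return n + 1;
}

static_assert(largestFibIndex64() == 93, "F(93) is the last value that fits in 64 bits");
```

The overflow test is written as a comparison against the maximum value rather than by performing the addition first, so no wrapped result is ever produced.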

And calculating 93 values during compilation is a really simple task.

We will first define the default approach for calculating a Fibonacci number as a constexpr function:

// Constexpr function to calculate a Fibonacci number iteratively.
// Note: it returns f2, so index 0 yields 1 and the results run 1, 1, 2, 3, 5, ...
constexpr unsigned long long getFibonacciNumber(size_t index) noexcept {
    // Seed values: F(0) = 0 and F(1) = 1
    unsigned long long f1{ 0 }, f2{ 1 };

    // Advance the pair (f1, f2) index times
    while (index--) {
        // Get the next value of the Fibonacci sequence
        unsigned long long f3 = f2 + f1;
        // Move to the next number
        f1 = f2;
        f2 = f3;
    }
    return f2;
}

With that, Fibonacci numbers can easily be calculated at compile time. Then, we fill a std::array with all Fibonacci numbers. We again use a constexpr function and make it a template with a variadic parameter pack.

We use std::integer_sequence to create a Fibonacci number for indices 0,1,2,3,4,5, ....

That is straightforward and not complicated:

template <size_t... ManyIndices>
constexpr auto generateArrayHelper(std::integer_sequence<size_t, ManyIndices...>) noexcept {
    return std::array<unsigned long long, sizeof...(ManyIndices)>{ { getFibonacciNumber(ManyIndices)... } };
}

This function will be fed with an integer sequence 0,1,2,3,4,... and return a std::array<unsigned long long, ...> with the corresponding Fibonacci numbers.

We know that we can store at most 93 values. Therefore we write one more function that calls the one above with the integer sequence 0,1,2,3,...,91,92, like so:

constexpr auto generateArray() noexcept {
    return generateArrayHelper(std::make_integer_sequence<size_t, MaxIndexFor64BitValue>());
}

And now, finally,

constexpr auto FIB = generateArray();

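will give us a compile-time std::array<unsigned long long, 93> with the name FIB containing all Fibonacci numbers that fit into 64 bits. Because getFibonacciNumber returns f2, FIB[0] is 1 and FIB[i] holds the (i+1)-th Fibonacci number (1, 1, 2, 3, 5, ...). Looking up a Fibonacci number is then a plain array access, FIB[i], with no calculation at runtime.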

I do not think that there is a faster way to calculate the n'th Fibonacci number.

Please see the complete program below:

#include <iostream>
#include <array>
#include <utility>
// ----------------------------------------------------------------------
// All the following will be done during compile time

// Constexpr function to calculate the nth Fibonacci number
constexpr unsigned long long getFibonacciNumber(size_t index) {
    // Seed values: F(0) = 0 and F(1) = 1 (the function returns f2, so index 0 yields 1)
    unsigned long long f1{ 0 }, f2{ 1 };

    // calculating Fibonacci value 
    while (index--) {
        // get next value of Fibonacci sequence 
        unsigned long long f3 = f2 + f1;
        // Move to next number
        f1 = f2;
        f2 = f3;
    }
    return f2;
}
// We will automatically build an array of Fibonacci numbers at compile time
// Generate a std::array with n elements 
template <size_t... ManyIndices>
constexpr auto generateArrayHelper(std::integer_sequence<size_t, ManyIndices...>) noexcept {
    return std::array<unsigned long long, sizeof...(ManyIndices)>{ { getFibonacciNumber(ManyIndices)... } };
}

// Number of Fibonacci values that fit in a 64-bit unsigned value (Binet's formula)
constexpr size_t MaxIndexFor64BitValue = 93;

// Generate the required number of elements
constexpr auto generateArray() noexcept {
    return generateArrayHelper(std::make_integer_sequence<size_t, MaxIndexFor64BitValue>());
}

// This is a constexpr array of all Fibonacci numbers
constexpr auto FIB = generateArray();
// ----------------------------------------------------------------------

// Test
int main() {
    // Print all precomputed Fibonacci numbers
    for (size_t i{}; i < MaxIndexFor64BitValue; ++i) {
        std::cout << i << "\t--> " << FIB[i] << '\n';
    }

    return 0;
}

Developed and tested with Microsoft Visual Studio Community 2019, Version 16.8.2.

Additionally compiled and tested with clang 11.0 and gcc 10.2.

Language: C++17
