29

I have an integer N which I know at compile time. I also have a std::array holding integers describing the shape of an N-dimensional array. I want to generate nested loops, as described below, at compile time, using metaprogramming techniques.

constexpr int N {4};
constexpr std::array<int, N> shape {{1,3,5,2}};


auto f = [/* accept object which uses coords */] (auto... coords) { 
     // do sth with coords
}; 

// This is what I want to generate.
for(int i = 0; i < shape[0]; i++) {
     for(int j = 0; j < shape[1]; j++) {
          for(int k = 0; k < shape[2]; k++) {
                for(int l = 0; l < shape[3]; l++) {
                    f(i,j,k,l); // object is modified via the lambda function.
                }
          }
     }
}

Note that N is known at compile time but may change between compilations, so I can't hard-code the loops as above. Ideally, the loop-generation mechanism would provide an interface that accepts the lambda function, generates the loops, and calls the function, producing code equivalent to the above. I am aware that an equivalent loop can be written at runtime with a single while loop and an array of indices, and that there are already answers covering that approach. I am, however, not interested in that solution. I am also not interested in solutions involving preprocessor magic.

Teodor Nikolov
  • 1
    Isn't your compiler unrolling the loops when optimization is turned on? – Arunmu Jul 15 '16 at 10:25
  • 1
    It might very well unroll some loops, but that is not the point. I am asking about loop generation, not loop unrolling. Anyhow, the array can hold values which are much larger than the integers specified above. Unrolling the loops completely will not be possible in general. – Teodor Nikolov Jul 15 '16 at 10:28
  • 1
    Use a single loop and compute `i`, `j`, `k` and `l`, etc., from the single loop index. – Cheers and hth. - Alf Jul 15 '16 at 11:17
  • @TeodorNikolov: Using the indices at compile time ***is*** loop unrolling. – Cheers and hth. - Alf Jul 15 '16 at 11:18
  • 1
    @Cheersandhth.-Alf In this case it's not unrolling. Here the caller creates an instance of the wrapping structure, which in an ordinary runtime loop calls a member function of the next instance, and so on. So while instantiating templates, the compiler does not unroll loops. Rather, it recursively calculates template parameters for deeper instances. Godbolt's online disassembler confirms this point: https://godbolt.org/g/4xGzoB. On the other hand, if you add `-O2` (not even `-O3`!), all computations will be performed at compile time and the code boils down to a single `std::cout.operator<<()` call. – Sergey Jul 15 '16 at 12:54
  • @Sergey: It's not the compiler that unrolls loops, it's your code. It's a manual loop unrolling. – Cheers and hth. - Alf Jul 15 '16 at 13:05
  • 1
    @Sergey: I wanted to post concrete code for my comment above (using a single loop) and it is simpler, but what I cooked up has `i` varying fastest and `l` slowest, and I have no time to fix it. Or I don't feel I have. Maybe you can fix it for me and post it? At http://coliru.stacked-crooked.com/a/e073a4a40bc7a15e – Cheers and hth. - Alf Jul 15 '16 at 13:09
  • @Cheersandhth.-Alf I got your point. Your code indeed should be unrolled when fixed, unlike mine. I'll try to complete it a bit later. Thanks for your effort! – Sergey Jul 15 '16 at 13:37
  • [Here](https://github.com/tomilov/variant/blob/master/test/include/test/visit.hpp) is code that tests perfect forwarding of the visitor and the visitables for, say, the upcoming `std::variant` (and others). It enumerates (at compile time if possible) all possible combinations of reference-ness and constness for all objects involved. – Tomilov Anatoliy Jul 19 '16 at 08:15

4 Answers

26

Something like this (note: I take the shape as a variadic template argument pack):

#include <iostream>

template <int I, int ...N>
struct Looper{
    template <typename F, typename ...X>
    constexpr void operator()(F& f, X... x) {
        for (int i = 0; i < I; ++i) {
            Looper<N...>()(f, x..., i);
        }
    }
};

template <int I>
struct Looper<I>{
    template <typename F, typename ...X>
    constexpr void operator()(F& f, X... x) {
        for (int i = 0; i < I; ++i) {
            f(x..., i);
        }
    }
};

int main()
{
    int v = 0;
    auto f = [&](int i, int j, int k, int l) {
        v += i + j + k + l;
    };

    Looper<1, 3, 5, 2>()(f);

    auto g = [&](int i) {
        v += i;
    };

    Looper<5>()(g);

    std::cout << v << std::endl;
}
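The code above takes the shape as template arguments, while the question stores it in a `constexpr std::array`. One way the two could be connected is to expand the array into the parameter pack with `std::index_sequence`. This is only a minimal sketch, assuming C++14; the helper names `loop_over_shape` and `loop_over_shape_impl` are hypothetical and not part of the original answer.

```cpp
#include <array>
#include <cstddef>
#include <utility>   // std::index_sequence, std::make_index_sequence

// Looper as in the answer above, repeated so the sketch compiles on its own.
// The parameter pack is renamed to Rest to avoid confusion with the question's N.
template <int I, int... Rest>
struct Looper {
    template <typename F, typename... X>
    constexpr void operator()(F& f, X... x) {
        for (int i = 0; i < I; ++i) {
            Looper<Rest...>()(f, x..., i);
        }
    }
};

template <int I>
struct Looper<I> {
    template <typename F, typename... X>
    constexpr void operator()(F& f, X... x) {
        for (int i = 0; i < I; ++i) {
            f(x..., i);
        }
    }
};

// Shape from the question. It lives at namespace scope so that its
// elements can be used as template arguments.
constexpr int N{4};
constexpr std::array<int, N> shape{{1, 3, 5, 2}};

// Hypothetical helper: expands shape[0], ..., shape[N-1] into Looper's pack.
template <typename F, std::size_t... Is>
void loop_over_shape_impl(F& f, std::index_sequence<Is...>) {
    Looper<shape[Is]...>()(f);
}

// Hypothetical entry point: generates the N nested loops and calls f.
template <typename F>
void loop_over_shape(F& f) {
    loop_over_shape_impl(f, std::make_index_sequence<N>());
}
```

With this in place, `loop_over_shape(f)` for a four-argument lambda `f` would visit the same 1·3·5·2 = 30 index tuples as `Looper<1, 3, 5, 2>()(f)`, and changing `N` and `shape` would not require touching the call site.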
Nim
  • Doesn't this require that `f` is `constexpr`? – Cheers and hth. - Alf Jul 15 '16 at 11:43
  • @Cheersandhth.-Alf - erm, don't think so.. stick the `cout` in the lambda, should still work fine... – Nim Jul 15 '16 at 12:32
  • @Nim - nice solution; +1; the only defect I see is that you have to write a lambda function (`f()`, `g()`, etc.) for every `N`; I don't know how to solve it in general, but if you can use C++14 and the lambda must do a simple operation like the sum of the arguments, you can use a variadic lambda like `auto f = [](auto i, auto ... is) { auto ret = i; char unused[] { ((ret += is), '0')... }; return ret; };` – max66 Jul 15 '16 at 12:39
  • @Nim: I meant, in order for it to generate the calls *at compile time*. Otherwise this added complexity is not necessary. – Cheers and hth. - Alf Jul 15 '16 at 13:16
  • 2
    @Cheersandhth.-Alf, my understanding (which of course could be broken) is that the OP wants to generate an arbitrary-dimension loop where the number of dimensions is known but variable at compile time and the range is also known at compile time; however, the function that needs to be called is arbitrary. So the constexpr is moot: what the OP is after is the construct which generates the arbitrary-dimension loop without having to explicitly hand-code the individual loops for all dimensions. – Nim Jul 15 '16 at 13:54
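The variadic lambda suggested in the comments above removes the need to write one lambda per `N`. As written there, though, `char unused[] { ... }` becomes an array of unknown bound with an empty initializer when only one argument is passed, which standard C++ does not allow. The following is a minimal sketch of a variant that also handles that case, assuming C++14; the name `sum_coords` is hypothetical.

```cpp
#include <initializer_list>

// Hypothetical variant of the lambda from the comment above: adds up any
// number of arguments (at least one). The leading 0 keeps the initializer
// list non-empty even when `rest` is an empty pack.
const auto sum_coords = [](auto first, auto... rest) {
    auto total = first;
    // Braced initializer lists are evaluated left to right, so the
    // additions happen in argument order.
    (void)std::initializer_list<int>{0, ((void)(total += rest), 0)...};
    return total;
};
```

The same generic lambda can then be passed to a loop generator of any dimension, since its arity is deduced at each call.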
4

Assuming you don't want total loop unrolling, just generation of i, j, k etc. argument tuples for f:

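#include <stddef.h>     // size_t
#include <utility>      // std::index_sequence, std::make_index_sequence

// N, shape and f are the namespace-scope declarations from the question.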

template< int dim >
constexpr auto item_size_at()
    -> int
{ return ::shape[dim + 1]*item_size_at<dim + 1>(); }

template<> constexpr auto item_size_at<::N-1>() -> int { return 1; }

template< size_t... dim >
void call_f( int i, std::index_sequence<dim...> )
{
    f( (i/item_size_at<dim>() % ::shape[dim])... );
}

auto main()
    -> int
{
    int const n_items = ::shape[0]*item_size_at<0>();
    for( int i = 0; i < n_items; ++i )
    {
        call_f( i, std::make_index_sequence<::N>() );
    }
}
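The expression `i/item_size_at<dim>() % ::shape[dim]` above recovers each coordinate from a single flat index, with the last dimension varying fastest. The following sketch checks that decomposition at runtime against the hand-written loops from the question. It assumes C++14; the names `unflatten` and `matches_nested_loops` are hypothetical helpers introduced only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical runtime mirror of item_size_at/call_f: converts a flat index
// into N coordinates. item_size is the product of shape[d+1..N-1], i.e. the
// number of flat steps between two consecutive values of coordinate d.
template <std::size_t N>
std::array<int, N> unflatten(int flat, const std::array<int, N>& shape) {
    assert(flat >= 0);
    std::array<int, N> coords{};
    int item_size = 1;
    for (std::size_t d = N; d-- > 0;) {
        coords[d] = (flat / item_size) % shape[d];
        item_size *= shape[d];
    }
    return coords;
}

// Returns true if flat = 0, 1, 2, ... yields the coordinates in the same
// order as the four nested loops from the question.
inline bool matches_nested_loops(const std::array<int, 4>& shape) {
    int flat = 0;
    for (int i = 0; i < shape[0]; ++i)
        for (int j = 0; j < shape[1]; ++j)
            for (int k = 0; k < shape[2]; ++k)
                for (int l = 0; l < shape[3]; ++l) {
                    const std::array<int, 4> expected{{i, j, k, l}};
                    if (unflatten(flat++, shape) != expected) return false;
                }
    return true;
}
```

For the shape `{1, 3, 5, 2}`, flat index 7 maps to `(0, 0, 3, 1)`: 7 % 2 = 1, then (7 / 2) % 5 = 3, and the two leading coordinates are 0.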
Cheers and hth. - Alf
3

I suppose this is exactly what you asked for:

#include <array>
#include <iostream>

constexpr int N{4};
constexpr std::array<int, N> shape {{1,3,5,2}};

// Diagnostics

template<typename V, typename ...Vals>
struct TPrintf {
        constexpr static void call(V v, Vals ...vals) {
                std::cout << v << " ";
                TPrintf<Vals...>::call(vals...);
        }
};

template<typename V>
struct TPrintf<V> {
        constexpr static void call(V v) {
                std::cout << v << std::endl;
        }
};


template<typename ...Vals>
constexpr void t_printf(Vals ...vals) {
        TPrintf<Vals...>::call(vals...);
}

// Unroll

template<int CtIdx, typename F>
struct NestedLoops {
        template<typename ...RtIdx>
        constexpr static void call(const F& f, RtIdx ...idx) {
                for(int i = 0; i < shape[CtIdx]; ++i) {
                        NestedLoops<CtIdx + 1, F>::call(f, idx..., i);
                }
        }
};

template<typename F>
struct NestedLoops<N-1, F> {
        template<typename ...RtIdx>
        constexpr static void call(const F& f, RtIdx ...idx) {
                for(int i = 0; i < shape[N-1]; ++i) {
                        f(idx..., i);
                }
        }
};

template<typename F>
void nested_loops(const F& f) {
        NestedLoops<0, F>::call(f);
}

int main()
{
        auto lf = [](int i, int j, int k, int l) {
                t_printf(i,j,k,l);
        };

        nested_loops(lf);
        return 0;
}
Sergey
2

Another variant of the same thing:

#include <array>
#include <stddef.h>     // size_t

template <size_t shape_index, size_t shape_size>
struct Looper
{
    template <typename Functor>
    void operator()(const std::array<int, shape_size>& shape, Functor functor)
    {
        for (int index = 0; index < shape[shape_index]; ++index)
        {
            Looper<shape_index + 1, shape_size>()
                (
                    shape,
                    [index, &functor](auto... tail){ functor(index, tail...); }
                );
        }
    }
};

template <size_t shape_size>
struct Looper<shape_size, shape_size>
{
    template <typename Functor>
    void operator()(const std::array<int, shape_size>&, Functor functor)
    {
        functor();
    }
};

template <size_t shape_size, typename Functor>
void loop(const std::array<int, shape_size>& shape, Functor functor)
{
    Looper<0, shape_size>()(shape, functor);
}

Example of use:

#include <iomanip>      // std::setw
#include <iostream>

constexpr size_t N {4};

constexpr std::array<int, N> shape {{1,3,5,2}};

void f(int i, int j, int k, int l)
{
    std::cout
        << std::setw(5) << i
        << std::setw(5) << j
        << std::setw(5) << k
        << std::setw(5) << l
        << std::endl;
}

// ...

loop(shape, f);


Constructor