I've recently been evaluating the performance of ranges and views. I've posted a simple example below (also at https://www.godbolt.org/z/7ThxjKafc) where the difference in the generated assembly is much larger than I expected. With the latest GCC and -O3:

  • the assembly for sum_array contains 31 instructions and 8 jumps.
  • the assembly for sum_vec contains 12 instructions and 2 jumps.

Given that the size of m_array is known at compile time, I would have expected near identical assembly for both functions. Should I expect the optimizing compiler to improve in future versions, or is there some fundamental limitation in how std::views::join is specified?

#include <array>
#include <vector>
#include <ranges>

struct Foo {
    auto join() const { return m_array | std::views::join; }
    auto direct() const { return std::views::all(m_array[0]); }
    std::array<std::vector<int*>, 1> m_array;
};
__attribute__((noinline)) int sum_array(const Foo& foo)
{
    int result = 0;
    for (int* val : foo.join())
        result += *val;
    return result;
}
__attribute__((noinline)) int sum_vec(const Foo& foo)
{
    int result = 0;
    for (int* val : foo.direct())
        result += *val;
    return result;
}
MarkB