I'm using Expression Templates in a vector-like class for transformations such as moving averages. Here, unlike with standard arithmetic operations, operator[](size_t i) does not make a single access to element i; instead a whole loop has to be evaluated, e.g. for the moving average of a vector v:
double operator[](size_t i) const
{
    double ad = 0.0;
    for (int j = i - period + 1; j <= i; ++j)
        ad += v[j];
    return ad / period;
}
(That's not the real function, because one has to take care of non-negative indices, but that doesn't matter here.)
When using such a moving-average construct, I fear that the code becomes rather inefficient, especially if one takes a double or triple moving average: then one gets nested loops and therefore quadratic or cubic scaling with the period size.
My question is: are compilers smart enough to optimize such redundant loops away? Or is that not the case, so that one has to take care of intermediate storage manually (which is what I suspect)? How could one do this reasonably in the example code below?
Example code, adapted from Wikipedia, compiles with Visual Studio 2013:
CRTP base class and actual vector:

#include <vector>
#include <algorithm> // std::max, used by the moving-average class below
#include <iostream>  // std::cout in the usage example

// CRTP base: forwards indexing to the concrete expression type E
template <typename E>
struct VecExpression
{
    double operator[](int i) const { return static_cast<E const&>(*this)[i]; }
};

// concrete vector that actually owns its data
struct Vec : public VecExpression<Vec>
{
    Vec(size_t N) : data(N) {}
    double operator[](int i) const { return data[i]; }
    double& operator[](int i) { return data[i]; }
    std::vector<double> data;
};
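(For contrast, a standard element-wise expression node built on this base touches exactly one index per operand on each access. A minimal sketch, where VecSum and the operator+ overload are my own illustrative additions, not part of the original code:)

template <typename E1, typename E2>
struct VecSum : public VecExpression<VecSum<E1, E2> >
{
    VecSum(E1 const& _a, E2 const& _b) : a(_a), b(_b) {}
    // one access per operand, no inner loop
    double operator[](int i) const { return a[i] + b[i]; }
    E1 const& a;
    E2 const& b;
};

template <typename E1, typename E2>
VecSum<E1, E2> operator+(VecExpression<E1> const& a, VecExpression<E2> const& b)
{
    return VecSum<E1, E2>(static_cast<E1 const&>(a), static_cast<E2 const&>(b));
}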
Moving Average class:
template <typename VectorType>
struct VecMovingAverage : public VecExpression<VecMovingAverage<VectorType> >
{
    VecMovingAverage(VectorType const& _vector, int _period) : vector(_vector), period(_period) {}
    // average over the window [max(i - period + 1, 0), i]
    double operator[](int i) const
    {
        int s = std::max(i - period + 1, 0);
        double ad = 0.0;
        for (int j = s; j <= i; ++j)
            ad += vector[j];
        return ad / (i - s + 1);
    }
    VectorType const& vector;
    int period;
};
template<typename VectorType>
auto MovingAverage(VectorType const& vector, int period = 10) -> VecMovingAverage<VectorType>
{
    return VecMovingAverage<VectorType>(vector, period);
}
Now my above-mentioned fears arise with expressions like this:
Vec vec(100);
// name each stage so the inner expression objects (held by const reference) outlive tripleMA
auto ma1 = MovingAverage(vec, 20);
auto ma2 = MovingAverage(ma1, 20);
auto tripleMA = MovingAverage(ma2, 20);
std::cout << tripleMA[40] << std::endl;
which I suppose requires about 20^3 evaluations for a single operator[] call...?
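For comparison, one manual way of providing intermediate storage would be to materialize each stage into a plain Vec before wrapping it again. A minimal sketch under that assumption (the eval helper and its explicit length argument are my own additions, since the expression classes above don't carry a size):

template <typename E>
Vec eval(VecExpression<E> const& expr, size_t n)
{
    Vec result(n);
    for (int i = 0; i < (int)n; ++i)
        result[i] = expr[i]; // every element is computed exactly once
    return result;
}

// usage: each stage now costs O(n * period) instead of compounding
// Vec stage1 = eval(MovingAverage(vec, 20), 100);
// Vec stage2 = eval(MovingAverage(stage1, 20), 100);
// std::cout << MovingAverage(stage2, 20)[40] << std::endl;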
EDIT: One obvious solution is to store the results. Give the moving-average class a std::vector<double> data member as a cache (mutable, since the const operator[] has to fill it) and change the class to something like this (untested):
template <typename VectorType, bool Store>
struct VecMovingAverage : public VecExpression<VecMovingAverage<VectorType, Store> >
{
    VecMovingAverage(VectorType const& _vector, int _period) : vector(_vector), period(_period) {}

    // plain (uncached) evaluation of the average ending at index i
    double compute(int i) const
    {
        int s = std::max(i - period + 1, 0);
        double ad = 0.0;
        for (int j = s; j <= i; ++j)
            ad += vector[j];
        return ad / (i - s + 1);
    }

    double operator[](int i) const
    {
        if (Store)
        {
            if (i >= (int)data.size())
            {
                // fill the cache up to index i, so that every stored entry
                // is a real average and not a zero left behind by resize()
                int oldSize = (int)data.size();
                data.resize(i + 1);
                for (int k = oldSize; k <= i; ++k)
                    data[k] = compute(k);
            }
            return data[i];
        }
        return compute(i);
    }

    VectorType const& vector;
    int period;
    // cache of already computed averages; mutable so the const operator[] can fill it
    mutable std::vector<double> data;
};
In the factory function one can then choose to store the results:

template<typename VectorType>
auto MovingAverage(VectorType const& vector, int period = 10) -> VecMovingAverage<VectorType, true>
{
    static const bool Store = true;
    return VecMovingAverage<VectorType, Store>(vector, period);
}
This could be extended so that storage is applied only for multiple (nested) applications, etc.
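One possible direction for that extension (a rough, untested sketch of my own, not part of the code above) is to let the caller choose storage per stage through an extra template parameter, and to enable it for the inner stages, whose values are re-read by the overlapping windows of the stage above them:

template <bool Store, typename VectorType>
auto MovingAverage(VectorType const& vector, int period = 10) -> VecMovingAverage<VectorType, Store>
{
    return VecMovingAverage<VectorType, Store>(vector, period);
}

// usage: store the two inner stages, evaluate the outermost lazily
// auto cached1 = MovingAverage<true>(vec, 20);
// auto cached2 = MovingAverage<true>(cached1, 20);
// std::cout << MovingAverage<false>(cached2, 20)[40] << std::endl;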