1

Recently I have been working with Eigen matrices derived from raw buffers and I noticed this curious case:

#include <eigen3/Eigen/Dense>

int main(int argc, char const *argv[]) {
    /* code */
    const int M = 320;
    const int N = 640;
    const int K = 320;
    const int alpha = 2;
    const int beta = 1;
    Eigen::Matrix<int32_t, Eigen::Dynamic,Eigen::Dynamic> A = Eigen::Matrix<int32_t, Eigen::Dynamic,Eigen::Dynamic>::Random(M,K);
    Eigen::Matrix<int32_t, Eigen::Dynamic,Eigen::Dynamic> B = Eigen::Matrix<int32_t, Eigen::Dynamic,Eigen::Dynamic>::Random(K,N);
    Eigen::Matrix<int32_t, Eigen::Dynamic,Eigen::Dynamic> C = Eigen::Matrix<int32_t, Eigen::Dynamic,Eigen::Dynamic>::Random(M,N);

    //Following http://eigen.tuxfamily.org/dox/TopicWritingEfficientProductExpression.html

    C.noalias() += (A*alpha)*(B*beta); //WORKS

    C.noalias() += A*B;

    Eigen::Map<Eigen::Matrix<int32_t, M, K, Eigen::ColMajor> > map_a(A.data());
    Eigen::Map<Eigen::Matrix<int32_t, K, N, Eigen::ColMajor> > map_b(B.data());
    Eigen::Map<Eigen::Matrix<int32_t, M, N, Eigen::ColMajor> > map_c(C.data());

    map_c.noalias() += map_a*map_b; //WORKS

    map_c.noalias() += (map_a*alpha)*(map_b*beta); //COMPILE ERROR HERE

    return 0;
}

If I have big matrix dimensions, I can't allocate on the stack or I would get a OBJECT_ALLOCATED_ON_STACK_IS_TOO_BIG, therefore I use the Eigen dynamic allocator.

However, it seems that if I have a raw buffer and I map it to a matrix, i can not perform a BLAS 3 like gemm multiplication (C+= (alpha*A)*(beta*B)), due to compilation error: OBJECT_ALLOCATED_ON_STACK_IS_TOO_BIG. If I do a simple C += A*B everything works as expected.

In the example case, I map the raw buffer from a matrix allocated by Eigen, but in principle it could be the raw buffer from anything (such as std::vector).

Any idea what is happening here? As far as I can tell everything here should be heap allocated, and even if it weren't, why would C += A*B work with the mapped memory matrices and C+= (alpha*A)*(beta*B) would not?

Cheers,

Nick

XapaJIaMnu
  • 1,408
  • 3
  • 12
  • 28
  • Looks like a bug in the expression-tree interpretation. The following works, but your expression should get evaluated the same way: `map_c.noalias() += (alpha*beta)*(map_a*map_b);` (all `(` `)` on the rhs are optional). – chtz Feb 18 '19 at 07:55
  • yes there is something weird happening, I'll check. Nonetheless, for such large matrices better use runtime sizes as in Avi Ginsburg's answer. – ggael Feb 18 '19 at 08:14
  • @chtz, I can't get your example to work with clang. I was under the impression that as long as the memory is mapped and I take care of its allocations (which are on the heap), Eigen shouldn't attempt to manage it or make any assumptions about it. – XapaJIaMnu Feb 18 '19 at 08:17
  • Works with clang and gcc using the dev-branch of Eigen (not with 3.3.4 apparently): https://godbolt.org/z/IiuCZU – chtz Feb 18 '19 at 08:44
  • @chtz, I tried clang and gcc, but I don't Eigen-dev. ggael explained what's going on though, which is great. – XapaJIaMnu Feb 18 '19 at 08:49

2 Answers2

3

For such large matrices better use runtime sizes as in Avi Ginsburg's answer. That being said, I'll now explain what's going on within Eigen. The problem is that within the matrix product implementation, we have a branch like that (simplified):

if(<too small>)
  lazyproduct::eval(dst, lhs, rhs);
else
  gemm::eval(dst,lhs, rhs);

If the product is too small, instead of calling the heavy "gemm" routine, we fall back to a coefficient-based implementation, in your case:

map_c.noalias() += (map_a*alpha).lazyProduct(map_b*beta);

This path does not rewrite the expression as (alpha*beta)*(map_a*map_b) and therefore, in order to avoid recomputing map_a*alpha and map_b*beta many times, the strategy is to backed them within temporaries... hence the compilation error.

Of course, in your case this path will never be taken, and it would even be completely removed by the compiler if you increase EIGEN_STACK_ALLOCATION_LIMIT because the condition if(<too small>) is known at compile-time. How sad.

ggael
  • 28,425
  • 2
  • 65
  • 71
  • But if map my memory buffers explicitly, are they still considered compile time provided sizes and therefore stack allocated? Thank you for the detailed reply and explanatory reply! To recap my understanding, because the matrix sizes are known at compile time, a branch in the code (that is not ever executed) gets generated, which causes the compiler warning. Is this fixable? Is it worth to open a bug report on Eigen or not? – XapaJIaMnu Feb 18 '19 at 08:43
  • If you specify compile-time sizes to your Map, then this information will be propagated and Eigen will allocate temporaries on the stack. No need to report a bug. You already reached the main devs;) – ggael Feb 18 '19 at 10:21
  • @ggael, thank you so much for the quick response to the issue and fixing it! – XapaJIaMnu Feb 18 '19 at 14:19
2

Your Maps are wrapping statically sized matrices, e.g.:

Eigen::Map<Eigen::Matrix<int32_t, M, K, Eigen::ColMajor> > 
                                  ^  ^

Use dynamically sized Maps instead:

Eigen::Map<Eigen::Matrix<int32_t, Eigen::Dynamic, Eigen::Dynamic, Eigen::ColMajor> > map_a(A.data(), M, K);
Eigen::Map<Eigen::Matrix<int32_t, Eigen::Dynamic, Eigen::Dynamic, Eigen::ColMajor> > map_b(B.data(), K, N);
Eigen::Map<Eigen::Matrix<int32_t, Eigen::Dynamic, Eigen::Dynamic, Eigen::ColMajor> > map_c(C.data(), M, N);

This doesn't mean that you can change resize the Maps, just indicates how temporaries end up being allocated.

Avi Ginsburg
  • 10,323
  • 3
  • 29
  • 56
  • Ok, marking the map as dynamic does work, BUT as far as I understood from Eigen documentation, as soon as you use a map, Eigen will no longer manage any memory or make any assumptions about it. Your comment about temporaries make some sense, but then why does `map_c.noalias() += map_a*map_b;` work and `map_c.noalias() += (map_a*alpha)*(map_b*beta)`? They should both be mapped to the same BLAS routine and require no temporary memory storage.. – XapaJIaMnu Feb 18 '19 at 08:23