If I call the member function myMatrix.middleCols(a, b) on an Eigen Matrix3Xf matrix with a = 0 and b = myMatrix.cols()-1, I get a performance penalty. Of course I usually use other values for a and b, but these values make it easiest to compare against the plain matrix.

Is this normal behaviour? Is it because alignment cannot be guaranteed, so that no vectorization is possible? I did not find anything about it in the docs.

Here's some example code:

Matrix3Xf a_full = Matrix3Xf::Random(3, 400);
Vector3f v = Vector3f::Random();
RowVectorXf b_full = RowVectorXf::Random(400);

volatile int left = 0, right = 399;
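// middleCols(startCol, numCols) returns a temporary block view. Hold it by value:
// binding it to a non-const auto& only compiles as an MSVC extension.
auto a = a_full.middleCols(left, right);
auto b = b_full.middleCols(left, right);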
//auto& a = a_full;
//auto& b = b_full;

for (int i = 0; i < 1000000; ++i)
{
    b += (v.transpose() * a);
}

cout << b.sum();

With this code I get 8.6 s execution time. With the two commented-out lines (a = a_full; and b = b_full;) used instead of the middleCols lines, the execution time is 7.8 s.

yar
  • I assume you mean `myMatrix.cols() - 1`. Can you supply a full [MCVE]? – Avi Ginsburg Feb 25 '19 at 05:01
  • `myMatrix.middleRows(0, myMatrix.cols())` does not make sense for a `Matrix3Xf`. – ggael Feb 25 '19 at 07:33
  • Did you compile with `-O2 -DNDEBUG`? – chtz Feb 25 '19 at 08:10
  • @AviGinsburg yes, I edited it. @ggael true, but it makes the comparison easier. As I said, I stumbled upon it when using matrices just slightly smaller than the whole matrix, say `.middleRows(10, myMatrix.cols()-10)`. @chtz I compile with Visual Studio at the moment and build as Release. – yar Feb 25 '19 at 09:29
  • Do you actually mean `middleCols(10, myMatrix.cols()-10)`? Your expression is undefined behavior (and it would assert when compiled without `-DNDEBUG`) – chtz Feb 25 '19 at 09:37
  • Also, the more important question is, what are you doing with that expression? If you are not using it at all, it should get optimized away entirely. If you do something which benefits from alignment, you may indeed get slower code (depending on your architecture, and also the Eigen version you are using). – chtz Feb 25 '19 at 09:40
  • @chtz I'm sorry, I corrected the text and made it clearer. I use middleCols, not middleRows. I do some operations with it that profit from vectorization (addition, multiplication with constants, multiplication with a vector). I'm using Eigen 3.3.7. With middleCols you still have contiguous memory, so I did not expect a performance penalty. – yar Feb 25 '19 at 12:12
  • I think my comment about showing a [MCVE] still stands. There would be fewer questions about what you mean. – Avi Ginsburg Feb 25 '19 at 12:14
  • @AviGinsburg I added it now – yar Feb 25 '19 at 12:42

1 Answer

About multiplying with a constant: https://godbolt.org/z/a_OEEP. You do have some overhead, because Eigen can't know whether your columns start at an aligned position, so it multiplies the leading values one at a time until it reaches an aligned position. (Additionally, there is a cleanup loop at the end for the values that do not fill a whole packet.) If the number of columns is relatively small, this may have a significant impact.
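To make that overhead concrete, here is a minimal sketch in plain C++ of the three-phase loop described above: a scalar head up to the first aligned address, a packet-sized body, and a scalar cleanup tail. It is not Eigen's actual implementation. `leading_scalars` and `scale_in_place` are hypothetical names, and the 16-byte alignment with 4 floats per packet is an assumption that matches SSE.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an Eigen function): how many leading floats must be
// handled one at a time before `p` reaches a `bytes`-aligned address.
std::size_t leading_scalars(const float* p, std::size_t n, std::size_t bytes = 16) {
    assert(bytes % sizeof(float) == 0);
    const auto addr = reinterpret_cast<std::uintptr_t>(p);
    const std::size_t misalign = addr % bytes;
    if (misalign == 0) return 0;
    const std::size_t skip = (bytes - misalign) / sizeof(float);
    return skip < n ? skip : n;  // never more than the available elements
}

// Scales data[0..n) by s in three phases: scalar head, packet body, scalar tail.
void scale_in_place(float* data, std::size_t n, float s) {
    constexpr std::size_t packet = 4;  // assumption: 4 floats per 16-byte register
    const std::size_t head = leading_scalars(data, n);
    std::size_t i = 0;

    // Head: scalar steps until the address is aligned.
    for (; i < head; ++i) data[i] *= s;

    // Body: whole packets starting at an aligned address (a compiler may
    // turn the inner loop into one aligned SIMD multiply).
    const std::size_t body_end = head + ((n - head) / packet) * packet;
    for (; i < body_end; i += packet)
        for (std::size_t k = 0; k < packet; ++k) data[i + k] *= s;

    // Tail: cleanup loop for the elements that do not fill a packet.
    for (; i < n; ++i) data[i] *= s;
}
```

In a column-major `Matrix3Xf`, column c starts 12·c bytes after the first element. If the buffer itself is 16-byte aligned, a view starting at column 10 begins 120 bytes in, which is 8 bytes past an aligned address, so `leading_scalars` returns 2 for it. Only start columns that are multiples of 4 stay aligned, and a start column chosen at run time cannot promise that.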

Also, MSVC is often bad at inlining trivial functions. A lot of that is fixed in the development branch (default), but not in 3.3.x, by adding more forced inlines.

chtz