I am writing a general-purpose library using Eigen for computational mechanics,
dealing mostly with 6x6 matrices and 6x1 vectors.
I am considering using the Eigen::Ref<> template so that the functions also accept segments and blocks, as documented in http://eigen.tuxfamily.org/dox/TopicFunctionTakingEigenTypes.html and in the question "Correct usage of the Eigen::Ref<> class".
However, a small performance comparison suggests that Eigen::Ref has considerable overhead for such small functions compared to standard C++ references:
#include <ctime>
#include <iostream>
#include "Eigen/Core"

Eigen::Matrix<double, 6, 6> testRef(const Eigen::Ref<const Eigen::Matrix<double, 6, 6>>& A)
{
    Eigen::Matrix<double, 6, 6> temp = (A * A) * A;
    temp.diagonal().setOnes();
    return temp;
}

Eigen::Matrix<double, 6, 6> testNoRef(const Eigen::Matrix<double, 6, 6>& A)
{
    Eigen::Matrix<double, 6, 6> temp = (A * A) * A;
    temp.diagonal().setOnes();
    return temp;
}

int main(){
    using namespace std;
    int cycles = 10000000;
    Eigen::Matrix<double, 6, 6> testMat;
    testMat = Eigen::Matrix<double, 6, 6>::Ones();

    clock_t begin = clock();
    for(int i = 0; i < cycles; i++)
        testMat = testRef(testMat);
    clock_t end = clock();
    double elapsed_secs = double(end - begin) / CLOCKS_PER_SEC;
    std::cout << "Ref: " << elapsed_secs << std::endl;

    begin = clock();
    for(int i = 0; i < cycles; i++)
        testMat = testNoRef(testMat);
    end = clock();
    elapsed_secs = double(end - begin) / CLOCKS_PER_SEC;
    std::cout << "noRef : " << elapsed_secs << std::endl;
    return 0;
}
Output with gcc -O3:
Ref: 1.64066
noRef : 1.1281
So it seems that Eigen::Ref has considerable overhead (roughly 45% in this test), at least in cases with low actual computational effort.
On the other hand, the approach using const Eigen::Matrix<double, 6, 6>& A leads to unnecessary copies if blocks or segments are passed:
#include <Eigen/Core>
#include <iostream>

void test( const Eigen::Vector3d& a)
{
    std::cout << "addr in function " << &a << std::endl;
}

int main () {
    Eigen::Vector3d aa;
    aa << 1,2,3;
    std::cout << "addr outside function " << &aa << std::endl;
    test ( aa ) ;
    test ( aa.head(3) ) ;
    return 0;
}
Output:
addr outside function 0x7fff85d75960
addr in function 0x7fff85d75960
addr in function 0x7fff85d75980
The address in the last call differs, so passing aa.head(3) creates a temporary copy. This approach is therefore ruled out for the general case.
Alternatively, one could write function templates using Eigen::MatrixBase, as described in the documentation. However, this seems to be inefficient for large libraries, and it cannot be adapted to fixed-size matrices (6x6, 6x1) as in my case.
Is there any other alternative? What is the general recommendation for large general-purpose libraries?
Thank you in advance!
Edit: modified the first benchmark example according to the recommendations in the comments.