In an algorithm that I want to implement, I need to repeatedly operate on a matrix that is formed by a set of rows taken from another existing matrix. For clarity, I will call "A
" the original matrix, and "B
" the "working matrix" (consisting in some rows of A
).
The algorithm proceeds as follows:
- Select a row
r
fromA
- Either insert or remove such row in
B
- Perform some operations with
B
(if it is not empty) - Repeat
Some relevant information:
- If
r
has to be inserted inB
, I am sure that it is not already there (no risk of creating duplicates) - If
r
has to be removed fromB
, I am sure that it is already there (no risk to remove non-existent rows) - The ordering of the rows inside
B
does not matter
My objective is to implement this part of the algorithm efficiently using Eigen. With efficiently, I mainly refer to execution time.
I came up with 3 different solutions, I would like to know what are your thoughts about them and what improvements could be added.
Solution 1: resize and copy
The idea is to store B
in a MatrixXd
instance that is repeatedly expanded/contracted. Since ordering of the rows does not matter, I stack new rows at the bottom of B
. When I need to remove a row from B
, I simply copy lower rows upwards.
class Algorithm1 {
public:
bool show_b;
Algorithm1(const Eigen::MatrixXd& M) : A(M), B(0,M.cols()), show_b(false) {
rows.reserve(A.rows());
}
void run(const std::vector<std::pair<bool,int>>& ops) {
for(const auto& op : ops) {
if(rows.size() > 0) {
// If B is not empty, do something with it (print is just an example)
if(show_b)
std::cout << "-------------------" << std::endl << "B is:" << std::endl << B << std::endl;
}
// check if we have to insert or remove the row
if(op.first) {
// Row 'op.second' should be added to B.
B.conservativeResize(rows.size()+1, A.cols()); // add space for a column at the bottom
B.row(rows.size()) = A.row(op.second); // copy the desired row from A to B
rows.push_back(op.second); // keep track of which rows are inside B
}
else {
// Row 'op.second' should be removed from B.
auto it = std::find(rows.begin(), rows.end(), op.second); // find where the row is stored in B
int r = std::distance(rows.begin(), it); // get the actual index
for(int i=r; i<rows.size()-1; i++)
B.row(i) = B.row(i+1); // copy each row to the one above
B.conservativeResize(rows.size()-1, A.cols()); // remove the last row of B
rows.erase(it); // remove the row also from this list
}
}
}
private:
Eigen::MatrixXd A; // Original matrix
Eigen::MatrixXd B; // Working matrix
std::vector<int> rows; // list of rows in B, such that A.row(rows[i]) equals B.row(i)
};
Note that in principle the rows to be added/removed are selected internally by the algorithm. However, to keep it simpler, I manually pass a "list of operations" as a parameter to the algorithm. Each element of the std::vector
is a pair
whose first element tells if the row has to be added (pair.first
is true
) or removed (pair.first
is false
). The index of the row to be selected is given in pair.second
(the index refers to the rows of A
).
Solution 2: Eigen::Map
Another solution that I could think of is to store the data of B
in a "raw buffer", which can be used in the main algorithm via a Eigen::Map
. To be able to quickly add/remove an entire row
from the buffer, I had to use row-major ordering for both A
and B
.
class Algorithm2 {
public:
// Row-major matrix
typedef Eigen::Matrix<double,Eigen::Dynamic,Eigen::Dynamic,Eigen::RowMajor> RMatrixXd;
bool show_b;
Algorithm2(const Eigen::MatrixXd& M) : A(M), show_b(false) {
rows.reserve(A.rows());
data.reserve(A.size());
}
void run(const std::vector<std::pair<bool,int>>& ops) {
for(const auto& op : ops) {
if(rows.size() > 0) {
// Create the sub-matrix B using a Map
Eigen::Map<RMatrixXd> B(data.data(), rows.size(), A.cols());
// Do something with it (print is just an example)
if(show_b)
std::cout << "-------------------" << std::endl << "B is:" << std::endl << B << std::endl;
}
// check if we have to insert or remove the row
if(op.first) {
// Row 'op.second' should be added to B.
// append the row content inside the data
data.insert(
data.end(), // append at the end
A.data() + op.second*A.cols(), // from the beginning of the row
A.data() + (op.second+1)*A.cols() // until the beginning of the next one
);
rows.push_back(op.second); // keep track of which rows are inside B
}
else {
// Row 'op.second' should be removed from B.
auto it = std::find(rows.begin(), rows.end(), op.second); // find where the row is stored in B
int r = std::distance(rows.begin(), it); // get the actual index
data.erase(data.begin()+r*A.cols(), data.begin()+(r+1)*A.cols()); // remove the corresponding values from the data
rows.erase(it); // remove the row also from this list
}
}
}
private:
RMatrixXd A; // The original matrix
std::vector<int> rows; // list of rows in B, such that A.row(rows[i]) equals B.row(i)
std::vector<double> data; // content of B (row-major ordering)
};
Solution 3: new overload of operator()
Using the latest version of Eigen, it is possible to select a sub-matrix directly via operator()
. In practice, the matrix B
can be obtained as A( rows-to-select , Eigen::all )
.
class Algorithm3 {
public:
bool show_b;
Algorithm3(const Eigen::MatrixXd& M) : A(M), show_b(false) {
rows.reserve(A.rows());
}
void run(const std::vector<std::pair<bool,int>>& ops) {
for(const auto& op : ops) {
if(rows.size() > 0) {
// Create the sub-matrix B
auto B = A(rows, Eigen::all);
// Do something with it (print is just an example)
if(show_b)
std::cout << "-------------------" << std::endl << "B is:" << std::endl << B << std::endl;
}
// check if we have to insert or remove the row
if(op.first) {
// Row 'op.second' should be added to B.
rows.push_back(op.second); // keep track of which rows are inside B
}
else {
// Row 'op.second' should be removed from B.
rows.erase(std::find(rows.begin(), rows.end(), op.second)); // remove the row from the list
}
}
}
private:
Eigen::MatrixXd A; // The original matrix
std::vector<int> rows; // list of rows in B, such that A.row(rows[i]) equals B.row(i)
};
Quick comparison of the solutions
In the following there is a simple program to record execution times:
int main() {
Eigen::MatrixXd A(3,3);
A << 0,0,0,1,1,1,2,2,2;
std::vector<std::pair<bool,int>> ops {
// {add=true/remove=false, row-index} // which rows will now be inside B
{true, 1}, // 1
{true, 0}, // 0 1
{true, 2}, // 0 1 2
{false,0}, // 1 2
{false,2}, // 1
{true, 2}, // 1 2
{false,1}, // 2
{true, 0}, // 0 2
{true, 1}, // 0 1 2
{false,0}, // 1 2
{false,1}, // 2
{false,2}, // empty
};
// Instanciate
Algorithm1 alg1(A);
Algorithm2 alg2(A);
Algorithm3 alg3(A);
// warm-up
for(int i=0; i<1000; i++) {
alg1.run(ops);
alg2.run(ops);
alg3.run(ops);
}
// run the 3 algorithms
auto start1 = std::chrono::high_resolution_clock::now();
for(int i=0; i<1000; i++)
alg1.run(ops);
auto end1 = std::chrono::high_resolution_clock::now();
auto start2 = std::chrono::high_resolution_clock::now();
for(int i=0; i<1000; i++)
alg2.run(ops);
auto end2 = std::chrono::high_resolution_clock::now();
auto start3 = std::chrono::high_resolution_clock::now();
for(int i=0; i<1000; i++)
alg3.run(ops);
auto end3 = std::chrono::high_resolution_clock::now();
// show the results
std::cout << "Alg1 took " << std::chrono::duration_cast<std::chrono::nanoseconds>(end1-start1).count() << "ns" << std::endl;
std::cout << "Alg2 took " << std::chrono::duration_cast<std::chrono::nanoseconds>(end2-start2).count() << "ns" << std::endl;
std::cout << "Alg3 took " << std::chrono::duration_cast<std::chrono::nanoseconds>(end3-start3).count() << "ns" << std::endl;
return 0;
}
On my laptop, the best results are given by the 2nd strategy, followed by the 3rd one (~3.4 times longer) and the 1st one (~4.3 times slower).
What do you think? Do the results make sense compared to your experience? Can you suggest any better strategy?