2

In an algorithm that I want to implement, I need to repeatedly operate on a matrix that is formed by a set of rows taken from another existing matrix. For clarity, I will call "A" the original matrix, and "B" the "working matrix" (consisting in some rows of A).

The algorithm proceeds as follows:

  1. Select a row r from A
  2. Either insert or remove such row in B
  3. Perform some operations with B (if it is not empty)
  4. Repeat

Some relevant information:

  • If r has to be inserted in B, I am sure that it is not already there (no risk of creating duplicates)
  • If r has to be removed from B, I am sure that it is already there (no risk to remove non-existent rows)
  • The ordering of the rows inside B does not matter

My objective is to implement this part of the algorithm efficiently using Eigen. With efficiently, I mainly refer to execution time.

I came up with 3 different solutions, I would like to know what are your thoughts about them and what improvements could be added.

Solution 1: resize and copy

The idea is to store B in a MatrixXd instance that is repeatedly expanded/contracted. Since ordering of the rows does not matter, I stack new rows at the bottom of B. When I need to remove a row from B, I simply copy lower rows upwards.

class Algorithm1 {
public:
  bool show_b;

  Algorithm1(const Eigen::MatrixXd& M) : A(M), B(0,M.cols()), show_b(false) {
    rows.reserve(A.rows());
  }

  void run(const std::vector<std::pair<bool,int>>& ops) {
    for(const auto& op : ops) {
      if(rows.size() > 0) {
        // If B is not empty, do something with it (print is just an example)
        if(show_b)
          std::cout << "-------------------" << std::endl << "B is:" << std::endl << B << std::endl;
      }

      // check if we have to insert or remove the row
      if(op.first) {
        // Row 'op.second' should be added to B.
        B.conservativeResize(rows.size()+1, A.cols()); // add space for a column at the bottom
        B.row(rows.size()) = A.row(op.second); // copy the desired row from A to B
        rows.push_back(op.second); // keep track of which rows are inside B
      }
      else {
        // Row 'op.second' should be removed from B.
        auto it = std::find(rows.begin(), rows.end(), op.second); // find where the row is stored in B
        int r = std::distance(rows.begin(), it); // get the actual index
        for(int i=r; i<rows.size()-1; i++)
          B.row(i) = B.row(i+1); // copy each row to the one above
        B.conservativeResize(rows.size()-1, A.cols()); // remove the last row of B
        rows.erase(it); // remove the row also from this list
      }
    }
  }

private:
  Eigen::MatrixXd A; // Original matrix
  Eigen::MatrixXd B; // Working matrix
  std::vector<int> rows; // list of rows in B, such that A.row(rows[i]) equals B.row(i)
};

Note that in principle the rows to be added/removed are selected internally by the algorithm. However, to keep it simpler, I manually pass a "list of operations" as a parameter to the algorithm. Each element of the std::vector is a pair whose first element tells if the row has to be added (pair.first is true) or removed (pair.first is false). The index of the row to be selected is given in pair.second (the index refers to the rows of A).

Solution 2: Eigen::Map

Another solution that I could think of is to store the data of B in a "raw buffer", which can be used in the main algorithm via a Eigen::Map. To be able to quickly add/remove an entire row from the buffer, I had to use row-major ordering for both A and B.

class Algorithm2 {
public:
  // Row-major matrix
  typedef Eigen::Matrix<double,Eigen::Dynamic,Eigen::Dynamic,Eigen::RowMajor> RMatrixXd;

  bool show_b;

  Algorithm2(const Eigen::MatrixXd& M) : A(M), show_b(false) {
    rows.reserve(A.rows());
    data.reserve(A.size());
  }

  void run(const std::vector<std::pair<bool,int>>& ops) {
    for(const auto& op : ops) {
      if(rows.size() > 0) {
        // Create the sub-matrix B using a Map
        Eigen::Map<RMatrixXd> B(data.data(), rows.size(), A.cols());

        // Do something with it (print is just an example)
        if(show_b)
          std::cout << "-------------------" << std::endl << "B is:" << std::endl << B << std::endl;
      }

      // check if we have to insert or remove the row
      if(op.first) {
        // Row 'op.second' should be added to B.
        // append the row content inside the data
        data.insert(
          data.end(), // append at the end
          A.data() + op.second*A.cols(), // from the beginning of the row
          A.data() + (op.second+1)*A.cols() // until the beginning of the next one
        );
        rows.push_back(op.second); // keep track of which rows are inside B
      }
      else {
        // Row 'op.second' should be removed from B.
        auto it = std::find(rows.begin(), rows.end(), op.second); // find where the row is stored in B
        int r = std::distance(rows.begin(), it); // get the actual index
        data.erase(data.begin()+r*A.cols(), data.begin()+(r+1)*A.cols()); // remove the corresponding values from the data
        rows.erase(it); // remove the row also from this list
      }
    }
  }

private:
  RMatrixXd A; // The original matrix
  std::vector<int> rows; // list of rows in B, such that A.row(rows[i]) equals B.row(i)
  std::vector<double> data; // content of B (row-major ordering)
};

Solution 3: new overload of operator()

Using the latest version of Eigen, it is possible to select a sub-matrix directly via operator(). In practice, the matrix B can be obtained as A( rows-to-select , Eigen::all ).

class Algorithm3 {
public:
  bool show_b;

  Algorithm3(const Eigen::MatrixXd& M) : A(M), show_b(false) {
    rows.reserve(A.rows());
  }

  void run(const std::vector<std::pair<bool,int>>& ops) {
    for(const auto& op : ops) {
      if(rows.size() > 0) {
        // Create the sub-matrix B
        auto B = A(rows, Eigen::all);

        // Do something with it (print is just an example)
        if(show_b)
          std::cout << "-------------------" << std::endl << "B is:" << std::endl << B << std::endl;
      }

      // check if we have to insert or remove the row
      if(op.first) {
        // Row 'op.second' should be added to B.
        rows.push_back(op.second); // keep track of which rows are inside B
      }
      else {
        // Row 'op.second' should be removed from B.
        rows.erase(std::find(rows.begin(), rows.end(), op.second)); // remove the row from the list
      }
    }
  }

private:
  Eigen::MatrixXd A; // The original matrix
  std::vector<int> rows; // list of rows in B, such that A.row(rows[i]) equals B.row(i)
};

Quick comparison of the solutions

In the following there is a simple program to record execution times:

int main() {

  Eigen::MatrixXd A(3,3);
  A << 0,0,0,1,1,1,2,2,2;

  std::vector<std::pair<bool,int>> ops {
    // {add=true/remove=false, row-index} // which rows will now be inside B
    {true, 1}, // 1
    {true, 0}, // 0 1
    {true, 2}, // 0 1 2
    {false,0}, // 1 2
    {false,2}, // 1
    {true, 2}, // 1 2
    {false,1}, // 2
    {true, 0}, // 0 2
    {true, 1}, // 0 1 2
    {false,0}, // 1 2
    {false,1}, // 2
    {false,2}, // empty
  };

  // Instanciate
  Algorithm1 alg1(A);
  Algorithm2 alg2(A);
  Algorithm3 alg3(A);

  // warm-up
  for(int i=0; i<1000; i++) {
    alg1.run(ops);
    alg2.run(ops);
    alg3.run(ops);
  }

  // run the 3 algorithms
  auto start1 = std::chrono::high_resolution_clock::now();
  for(int i=0; i<1000; i++)
    alg1.run(ops);
  auto end1 = std::chrono::high_resolution_clock::now();

  auto start2 = std::chrono::high_resolution_clock::now();
  for(int i=0; i<1000; i++)
    alg2.run(ops);
  auto end2 = std::chrono::high_resolution_clock::now();

  auto start3 = std::chrono::high_resolution_clock::now();
  for(int i=0; i<1000; i++)
    alg3.run(ops);
  auto end3 = std::chrono::high_resolution_clock::now();

  // show the results
  std::cout << "Alg1 took " << std::chrono::duration_cast<std::chrono::nanoseconds>(end1-start1).count() << "ns" << std::endl;
  std::cout << "Alg2 took " << std::chrono::duration_cast<std::chrono::nanoseconds>(end2-start2).count() << "ns" << std::endl;
  std::cout << "Alg3 took " << std::chrono::duration_cast<std::chrono::nanoseconds>(end3-start3).count() << "ns" << std::endl;

  return 0;
}

On my laptop, the best results are given by the 2nd strategy, followed by the 3rd one (~3.4 times longer) and the 1st one (~4.3 times slower).

What do you think? Do the results make sense compared to your experience? Can you suggest any better strategy?

ffusco
  • 68
  • 5

0 Answers0