I am implementing Latent Dirichlet Allocation (LDA) in Rcpp. In LDA, we need to deal with a huge sparse matrix (e.g. 50 x 3000).
I decided to use SparseMatrix in Eigen. However, since I need random access to individual cells, the computationally expensive .coeffRef
call slows down my function a lot.
Is there any way to use SparseMatrix while keeping the speed?
What I want to do has four steps:
- I know which cell (i,j) I want to access.
- I want to know whether the cell (i,j) is 0 or not.
- If the cell (i,j) is not 0, I want to know its value.
- After doing some analysis with the value from steps 2 and 3, I want to update the cell (i,j). In this step, I might need to update a cell (i,j) that was originally 0.
#include <iostream>
#include <vector>
#include <Eigen/Dense>
#include <Eigen/Sparse>
using namespace std;
using namespace Eigen;
typedef Eigen::Triplet<double> T;
int main(){
    Eigen::SparseMatrix<double> spmat;
    // Insert in spmat
    vector<T> tripletList;
    double value = 0; // initialized, because it is also used when (i,j) is 0
    tripletList.push_back(T(0,1,1));
    tripletList.push_back(T(0,3,2));
    tripletList.push_back(T(1,5,3));
    tripletList.push_back(T(2,4,4));
    tripletList.push_back(T(4,1,5));
    tripletList.push_back(T(4,5,6));
    spmat.resize(5,7); // define size
    spmat.setFromTriplets(tripletList.begin(), tripletList.end());
    for(int i=0; i<5; i++){ // I am accessing all cells just to clarify I need to access cell
        for(int j=0; j<7; j++){
            // Check if (i,j) is 0 (coeff only reads, it does not insert)
            if(spmat.coeff(i,j) != 0){
                // Some analysis
                value = spmat.coeff(i,j)*2; // just an example, more complex in the model
            }
            spmat.coeffRef(i,j) += value; // update (i,j); inserts the cell if it is missing
        }
    }
    cout << spmat << endl;
    return 0;
}
Since the number of rows is much smaller than the number of columns, I considered iterating over a column and then checking the row index of each entry, but I couldn't figure out how to use SparseMatrix<double>::InnerIterator it(spmat, colid).