I'm converting some of my own vector algebra code to use the optimized Boost uBLAS library. However, when I tried a symmetric-matrix by sparse-vector multiplication, I found it to be about 4x slower than my own implementation. The vector length is usually somewhere between 0 and 500, and about 70-80% of the entries are zero.
Here is my code:
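#include <boost/numeric/ublas/symmetric.hpp>
#include <boost/numeric/ublas/vector.hpp>
#include <boost/numeric/ublas/vector_sparse.hpp>
using namespace boost::numeric::ublas;

void CRoutines::GetA(double a[], double vectorIn[], int sparseVectorIndexes[], int vectorLength, int sparseLength)
{
    // Copy only the non-zero entries into a sparse uBLAS vector.
    compressed_vector<double> inVec(vectorLength, sparseLength);
    for (int i = 0; i < sparseLength; i++)
    {
        inVec(sparseVectorIndexes[i]) = vectorIn[sparseVectorIndexes[i]];
    }
    // matrix is a member of type symmetric_matrix<double, lower>.
    vector<double> test = prod(inVec, matrix);
    // Copy the dense result back into the output array.
    for (int i = 0; i < vectorLength; i++)
    {
        a[i] = test(i);
    }
}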
sparseVectorIndexes stores the indexes of the non-zero values of the input vector, vectorLength is the length of the vector, and sparseLength is the number of non-zeros in the vector. The matrix is stored as a symmetric matrix, symmetric_matrix<double, lower>.
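To make the storage assumption concrete, here is a minimal sketch of how I understand a lower-triangular symmetric matrix to be laid out. It assumes packed, row-major storage of the lower triangle, which may not match what uBLAS does internally in every configuration. The names packedLowerIndex and PackedSymmetric are hypothetical and are not part of uBLAS.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not a uBLAS function): offset of element (i, j) in a
// packed, row-major array holding only the lower triangle of a symmetric matrix.
inline std::size_t packedLowerIndex(std::size_t i, std::size_t j)
{
    if (j > i)
        std::swap(i, j);          // symmetry: element (i, j) equals (j, i)
    return i * (i + 1) / 2 + j;   // rows 0..i-1 hold i*(i+1)/2 entries in total
}

// Hypothetical stand-in for a symmetric matrix with packed lower storage.
struct PackedSymmetric
{
    std::size_t n;                // matrix dimension
    std::vector<double> data;     // n*(n+1)/2 stored entries

    explicit PackedSymmetric(std::size_t size)
        : n(size), data(size * (size + 1) / 2, 0.0) {}

    double& at(std::size_t i, std::size_t j)
    {
        assert(i < n && j < n);
        return data[packedLowerIndex(i, j)];
    }

    double at(std::size_t i, std::size_t j) const
    {
        assert(i < n && j < n);
        return data[packedLowerIndex(i, j)];
    }
};
```

Under this layout every element access goes through an index computation and a possible swap, which is the extra per-element work my 2D array version avoids only by branching on row <= i.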
My own implementation is a simple nested loop iteration where matrix is just a 2D double array:
void CRoutines::GetA(double a[], double vectorIn[], int sparseVectorIndexes[], int vectorLength, int sparseLength)
{
    for (int i = 0; i < vectorLength; i++)
    {
        double temp = 0;
        for (int j = 0; j < sparseLength; j++)
        {
            int row = sparseVectorIndexes[j];
            if (row <= i) // Handle lower triangular sparseness
                temp += matrix[i][row] * vectorIn[row];
            else
                temp += matrix[row][i] * vectorIn[row];
        }
        a[i] = temp;
    }
}
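For anyone who wants to reproduce the comparison without my class, the same loop can be sketched as a free function. This is only an illustration: symSparseProd is a hypothetical name, and it assumes the matrix is passed as a full n x n array of which only the entries with column <= row are ever read.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical standalone version of GetA. Only the lower triangle of `lower`
// (column <= row) is read; the upper triangle may hold anything.
std::vector<double> symSparseProd(const std::vector<std::vector<double>>& lower,
                                  const std::vector<double>& vectorIn,
                                  const std::vector<int>& nonZeroIndexes)
{
    const std::size_t n = vectorIn.size();
    assert(lower.size() == n);

    std::vector<double> a(n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
    {
        double temp = 0.0;
        // Visit only the non-zero entries of the input vector.
        for (int idx : nonZeroIndexes)
        {
            const std::size_t row = static_cast<std::size_t>(idx);
            // Mirror the index pair so the lower triangle is always the one read.
            const double m = (row <= i) ? lower[i][row] : lower[row][i];
            temp += m * vectorIn[row];
        }
        a[i] = temp;
    }
    return a;
}
```

The work is proportional to vectorLength times sparseLength, so with 70-80% zeros the inner loop touches only 20-30% of each matrix row.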
Why is uBLAS 4x slower? Am I not writing the multiplication properly? Or is there another library more suited to this?
EDIT: If I use a dense uBLAS vector instead of the compressed_vector, then uBLAS is only 2x slower...