20

I have a ~3000x3000 covariance-alike matrix on which I compute the eigenvalue-eigenvector decomposition (it's a OpenCV matrix, and I use cv::eigen() to get the job done).

However, I actually only need the, say, first 30 eigenvalues/vectors, I don't care about the rest. Theoretically, this should allow to speed up the computation significantly, right? I mean, that means it has 2970 eigenvectors less that need to be computed.

Which C++ library will allow me to do that? Please note that OpenCV's eigen() method does have the parameters for that, but the documentation says they are ignored, and I tested it myself, they are indeed ignored :D

UPDATE: I managed to do it with ARPACK. I managed to compile it for windows, and even to use it. The results look promising, an illustration can be seen in this toy example:

#include "ardsmat.h"
#include "ardssym.h"
int     n = 3;           // Dimension of the problem.
    double* EigVal = NULL;  // Eigenvalues.
    double* EigVec = NULL; // Eigenvectors stored sequentially.


    int lowerHalfElementCount = (n*n+n) / 2;
    //whole matrix:
    /*
    2  3  8
    3  9  -7
    8  -7 19
    */
    double* lower = new double[lowerHalfElementCount]; //lower half of the matrix
    //to be filled with COLUMN major (i.e. one column after the other, always starting from the diagonal element)
    lower[0] = 2; lower[1] = 3; lower[2] = 8; lower[3] = 9; lower[4] = -7; lower[5] = 19;
    //params: dimensions (i.e. width/height), array with values of the lower or upper half (sequentially, row major), 'L' or 'U' for upper or lower
    ARdsSymMatrix<double> mat(n, lower, 'L');

    // Defining the eigenvalue problem.
    int noOfEigVecValues = 2;
    //int maxIterations = 50000000;
    //ARluSymStdEig<double> dprob(noOfEigVecValues, mat, "LM", 0, 0.5, maxIterations);
    ARluSymStdEig<double> dprob(noOfEigVecValues, mat);

    // Finding eigenvalues and eigenvectors.

    int converged = dprob.EigenValVectors(EigVec, EigVal);
    for (int eigValIdx = 0; eigValIdx < noOfEigVecValues; eigValIdx++) {
        std::cout << "Eigenvalue: " << EigVal[eigValIdx] << "\nEigenvector: ";

        for (int i = 0; i < n; i++) {
            int idx = n*eigValIdx+i;
            std::cout << EigVec[idx] << " ";
        }
        std::cout << std::endl;
    }

The results are:

9.4298, 24.24059

for the eigenvalues, and

-0.523207, -0.83446237, -0.17299346
0.273269, -0.356554, 0.893416

for the 2 eigenvectors respectively (one eigenvector per row) The code fails to find 3 eigenvectors (it can only find 1-2 in this case, an assert() makes sure of that, but well, that's not a problem).

MShekow
  • 1,526
  • 2
  • 14
  • 28
  • 2
    By the 'first 30 eigenvalues/vectors', do you mean the eigenvalues with largest moduli, largest real parts, or something else? After googling, it looks like [SLEPc](http://www.grycap.upv.es/slepc/) might have what you are looking for. – James Jun 16 '12 at 15:29
  • I'm looking for the 30 eigenvectors that correspond to the largest 30 real eigenvalues, resulting from a eigen-decomposition of a real, symmetric matrix. – MShekow Jun 16 '12 at 15:31
  • 2
    I'd use ARPACK for this. You'll get your 30 eigenvectors instantly. – David Heffernan Jun 16 '12 at 15:41
  • Which particular implementation do you recommend? I need the thing to run on windows (I'm willing to compile it myself of course, if that's necessary). – MShekow Jun 16 '12 at 15:52
  • Use a websearch to find out about ARPACK. – David Heffernan Jun 16 '12 at 16:03
  • I did, ARPACK seems to be very "hip and fresh" (sarcasm included xD). I'm rather wondering whether there is any *modern* library that can do what I need, like eigen (I looked at their API, it does not seem they support to limit the no of eigenvectors to be determined). – MShekow Jun 16 '12 at 16:19
  • Ah, I thought you wanted performance. If you want something shiny, then ARPACK is not for you. It's good enough for MATLAB mind you. I don't know how you concluded that ARPACK won't extract a subsets of eigenvectors. You got that wrong. – David Heffernan Jun 16 '12 at 16:21
  • 5
    "Theoretically, this should allow to speed up the computation significantly, right?" That depends on the algorithm that's used, it's at least not trivially true. But yes, there are algorithms that allow fast calculation of only eigenvectors in some eigenvalue range. Arnoldi/Lanczos being the most prominent, so ARPACK is kind of canonical. That it's so old doesn't mean it's bad, it's certainly great if you really want performance; but I agree that these Fortran libraries are rather horrible to use. – leftaroundabout Jun 16 '12 at 16:30
  • 2
    Ah, I think you mean that eigen doesn't do what you want rather than ARPACK. Sorry. Anyway, all I can say is that I was formerly using a direct algo that could take many minutes and often consume all system memory to produce any results at all. With ARPACK I can get the first few hundred eigen pairs in a second or two without needed a lot of system resources. – David Heffernan Jun 16 '12 at 16:36
  • Yes it has a horrid interface, and I had to tweak it hard to make it threadsafe. But it is certainly excellent from an algo standpoint. It's fantastically high quality. Just a bit tricky to use. If you are not a strong programmer then you'd be best with something simpler to use. – David Heffernan Jun 16 '12 at 16:36
  • @DavidHeffernan yes I indeed meant "eigen" (the library) is not able to specify the no. of vectors. Given that I'm not a C++ expert, I'm very sure that the time it takes me to understand and use ARPACK is a _LOT_ higher than just waiting for the few hundred matrix decompositions I need to do for my experiments. Still, if anyone knows a more "high-level" (C/C++) library with my mentioned requirements, please shoot :) – MShekow Jun 16 '12 at 16:48
  • I might investigate why OpenCV 2.3.1 does NOT say (in its documentation) that the low and high index parameters are ignored (instead they explain their meaning), while 2.4.1 does. – MShekow Jun 16 '12 at 16:56
  • Turns out OpenCV 2.3 also did not use the two parameters. Anyways, I created a new topic in which I try to get ARPACK++ to work in Windows, see http://stackoverflow.com/questions/11071998/using-arpack-on-windows-with-visual-studio – MShekow Jun 17 '12 at 14:23
  • Updated the original post, it's finally working... :) – MShekow Jun 17 '12 at 17:55
  • 10
    @NameZero912, if you have the answer and nobody has provided it, make an answer yourself and accept it. It will get this question off the list of unanswered questions. :) – Yakk - Adam Nevraumont Dec 24 '12 at 04:10
  • @NameZero912: Pinging for you to do Yakk's suggestion. Please edit out the answer in your question and post it on its own. – GManNickG Jan 16 '13 at 01:40
  • Hi, Did you compare ARPACK and Intel MKL ssyevx? – Glenn Yu Jan 28 '13 at 19:18

2 Answers2

1

In this article, Simon Funk shows a simple, effective way to estimate a singular value decomposition (SVD) of a very large matrix. In his case, the matrix is sparse, with dimensions: 17,000 x 500,000.

Now, looking here, describes how eigenvalue decomposition closely related to SVD. Thus, you might benefit from considering a modified version of Simon Funk's approach, especially if your matrix is sparse. Furthermore, your matrix is not only square but also symmetric (if that is what you mean by covariance-like), which likely leads to additional simplification.

... Just an idea :)

Reinderien
  • 11,755
  • 5
  • 49
  • 77
arr_sea
  • 841
  • 10
  • 16
0

It seems that Spectra will do the job with good performances.

Here is an example from their documentation to compute the 3 first eigen values of a dense symmetric matrix M (likewise your covariance matrix):

#include <Eigen/Core>
#include <Spectra/SymEigsSolver.h>
// <Spectra/MatOp/DenseSymMatProd.h> is implicitly included
#include <iostream>
using namespace Spectra;
int main()
{
    // We are going to calculate the eigenvalues of M
    Eigen::MatrixXd A = Eigen::MatrixXd::Random(10, 10);
    Eigen::MatrixXd M = A + A.transpose();
    // Construct matrix operation object using the wrapper class DenseSymMatProd
    DenseSymMatProd<double> op(M);
    // Construct eigen solver object, requesting the largest three eigenvalues
    SymEigsSolver< double, LARGEST_ALGE, DenseSymMatProd<double> > eigs(&op, 3, 6);
    // Initialize and compute
    eigs.init();
    int nconv = eigs.compute();
    // Retrieve results
    Eigen::VectorXd evalues;
    if(eigs.info() == SUCCESSFUL)
        evalues = eigs.eigenvalues();
    std::cout << "Eigenvalues found:\n" << evalues << std::endl;
    return 0;
}
Cryckx
  • 659
  • 6
  • 18