12

I'm trying to write a spectral clustering algorithm using NumPy/SciPy for larger (but still tractable) systems, making use of SciPy's sparse linear algebra library. Unfortunately, I'm running into stability issues with eigsh().

Here's my code:

import numpy as np
import scipy.sparse
import scipy.sparse.linalg as SLA
import sklearn.utils.graph as graph

W = self._sparse_rbf_kernel(self.X_, self.datashape)
D = scipy.sparse.csc_matrix(np.diag(np.array(W.sum(axis = 0))[0]))
L = graph.graph_laplacian(W) # D - W
vals, vects = SLA.eigsh(L, k = self.k, M = D, which = 'SM', sigma = 0, maxiter = 1000)

The sklearn library refers to the scikit-learn package, specifically this method for calculating a graph laplacian from a sparse SciPy matrix.

_sparse_rbf_kernel is a method I wrote to compute pairwise affinities of the data points. It operates by creating a sparse affinity matrix from image data, specifically by only computing pairwise affinities for the 8-neighborhoods around each pixel (instead of pairwise for all pixels with scikit-learn's rbf_kernel method, which for the record doesn't fix this either).

Since the laplacian is unnormalized, I'm looking for the smallest eigenvalues and corresponding eigenvectors of the system. I understand that ARPACK is ill-suited for finding small eigenvalues, but I'm trying to use shift-invert to find these values and am still not having much success.

With the above arguments (specifically, sigma = 0), I get the following error:

RuntimeError: Factor is exactly singular

With sigma = 0.001, I get a different error:

scipy.sparse.linalg.eigen.arpack.arpack.ArpackNoConvergence: ARPACK error -1: No convergence (1001 iterations, 0/5 eigenvectors converged)

I've tried all three different values for mode with the same result. Any suggestions for using the SciPy sparse library for finding small eigenvalues of a large system?

Igor F.
  • 2,649
  • 2
  • 31
  • 39
Magsol
  • 4,640
  • 11
  • 46
  • 68

1 Answers1

16

You should use which='LM': in the shift-invert mode, this parameter refers to the transformed eigenvalues. (As explained in the documentation.)

pv.
  • 33,875
  • 8
  • 55
  • 49
  • 1
    Right, but the problem is I want the smallest eigenvalues and their associated eigenvectors, not the largest. The documentation explains that if small eigenvalues are desired, shift-invert mode should be used to improve performance, which is what I was trying to do by using `sigma`, to no avail. – Magsol Aug 31 '12 at 15:04
  • 5
    The above should give you exactly what you want. Note: with sigma=0, the transformed eigenvalues are w'=1/(w-sigma)=1/w. Therefore, with `which='LM'` you get the eigenvalues with large **w'**, or in other words, the smallest eigenvalues of the original problem. If you use the shift-invert mode, you need to adjust also the `which` parameter accordingly. – pv. Aug 31 '12 at 20:32
  • 2
    Just wanted to say this was extremely helpful – YXD Mar 29 '13 at 00:14