  1. Is it possible to compute all the eigenvalues of a large sparse matrix using multiple CPUs?

  2. If yes, is it possible to do so without storing the full dense matrix in memory (using only the stored sparse matrix)?

  3. If yes, what is a good (fast and low-memory) method for doing it?

  4. Can numpy or scipy do it?

My matrix is complex and non-Hermitian, as sparse as the identity matrix, and of dimension N x N, where N = BinomialCoefficient(L, Floor(L/2)) and L needs to be taken as large as possible.

For example, with L = 20, N = 184,756 and the matrix is 99.9995% sparse, having just N non-zero elements. So the memory usage of the sparse matrix is ~0.1 GB, but it would be ~10 TB for the dense matrix. With L = 30, N = 155,117,520, and we use ~60 GB (sparse) versus ~10 EB (dense). So it is impractical to store the full dense matrix in memory.
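For reference, the back-of-the-envelope arithmetic behind these sizes can be sketched as follows (a rough estimate assuming complex128 values in a CSR layout with 8-byte indices; the exact figures depend on the storage format and index width used):

```python
from math import comb

# Rough memory estimate, assuming complex128 (16 B per value) stored in
# CSR format with 8-byte indices; actual figures depend on the library.
for L in (20, 30):
    N = comb(L, L // 2)          # matrix dimension
    nnz = N                      # roughly N non-zero elements, as stated above
    sparse_bytes = nnz * (16 + 8) + (N + 1) * 8   # data + col indices + indptr
    dense_bytes = N * N * 16                      # full dense storage
    print(f"L={L}: N={N:,}, sparse ~{sparse_bytes/1e9:.3f} GB, "
          f"dense ~{dense_bytes/1e12:.1f} TB")
```

Whatever the exact constants, the dense footprint grows as N^2 while the sparse one grows as N, which is the gap that rules out dense storage here.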

I have access to Intel® Gold 6148 Skylake @ 2.4 [GHz] CPUs with up to 752 [GB] of RAM each. I could use Python, C (ScaLAPACK, OpenBLAS, MAGMA, ELPA, MUMPS, SuperLU, SuiteSparse, PETSc, Lis, ...), C++ (Armadillo, Eigen, Blitz++, Trilinos, ...), Matlab, R, Perl, Fortran, mpi4py, CUDA, Intel® Math Kernel Library, and a few other software packages.

I build my matrix using Python (scipy.sparse, numpy and multiprocessing). I've tried numpy.linalg.eigvals() and scipy.linalg.eigvals(), but it seems that they only use the cores of one CPU. I could look further into those, but I won't if there's a better way to solve the problem.
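For what it's worth, here is a minimal sketch of the dense route on a deliberately tiny matrix (N = 200 and the random sparsity pattern are illustrative assumptions, not the actual operator). For complex non-symmetric input, scipy.linalg.eigvals dispatches to LAPACK's zgeev, and any multithreading it shows comes from the underlying BLAS build (often controlled via OMP_NUM_THREADS or MKL_NUM_THREADS), which may be why only one CPU looks busy:

```python
import numpy as np
import scipy.sparse as sp
from scipy.linalg import eigvals

# Toy stand-in for the real operator: a sparse complex non-Hermitian
# matrix with ~N non-zero entries, densified before the full-spectrum solve.
N = 200
rng = np.random.default_rng(42)
rows = rng.integers(0, N, size=N)
cols = rng.integers(0, N, size=N)
vals = rng.normal(size=N) + 1j * rng.normal(size=N)
A = sp.coo_matrix((vals, (rows, cols)), shape=(N, N)).tocsr()

w = eigvals(A.toarray())   # LAPACK zgeev; threading comes from the BLAS
assert w.shape == (N,)     # full spectrum: N complex eigenvalues
```

The densification step (A.toarray()) is exactly what becomes impossible at the target sizes, which is the crux of the question.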

For the curious ones, my matrix comes from a representation of a non-Hermitian operator on a subset of states of a length-L quantum spin-1/2 chain with strong interactions. I need the full spectrum because it allows me to study the level spacing distribution of the energy spectrum for a fixed set of quantum numbers.

I'm far from being a professional in computer science, so if I missed some basic concept, please bear with me.

Gab
  • Would you mind, @Gab, adding a few details? ***a)*** What are your current, as-is processing times? ***b)*** How often do you experience solution instabilities? ***c)*** What are your (im)precision thresholds for the eigenvalue solution? ***d)*** How large an **L** is your final solution aiming at? ***e)*** What processing times are acceptable in your domain of expertise for eigenvalue problems of that size? ***f)*** What are your **time**-to-market expectations for having such a final solution RTO? ***g)*** What are your **financing** restrictions for achieving all of the above? – user3666197 Jan 07 '20 at 13:36
  • Did you consider https://spectralib.org/ ? – One Lyner Jan 07 '20 at 14:25
  • a) I achieved L = 17 with a less powerful build (CPU and RAM) in around 4 h. b) From L = 4 to 17 I did not experience any solution instabilities. c) As I'm studying the spacing between eigenvalues, precision is important; I was aiming for something around 10^(-10) (my eigenvalues are in [0, 40]). d) I'd like to reach at least L = 22, but the more the better. e) I'd say 20 h. f) I don't understand the question, I'm sorry. g) I should do it with already available resources only. – Gab Jan 08 '20 at 20:18
  • No, I did not know about spectralib, but I'll look into it, thanks! – Gab Jan 08 '20 at 20:19
  • @Gab thx for the as-is state details. The point about precision is cardinal: standard tools have problems with both IEEE floating-point number representation precision & solver-method instabilities (more often with iterative methods). For small L-s, in-cache computing gets fast on the i-Gold-6148 (having 27+ MB of L3 cache, data access keeps happening at costs of units of **[ns]**). **L ~30+** gets you above **5 [GB]** of data representing the sparse N×N matrix of complex values. This turns the processing from CPU-intensive (happily in-cache) into RAM-I/O intensive… **2+ orders of magnitude slower** – user3666197 Jan 08 '20 at 20:58
  • + Not all solvers are ready to meet all of your requirements: those that can handle **sparse** representations of matrix data need not also be ready for **complex** data types; some can handle both, but only for symmetric or multi-band layouts. So you will spend a lot of time checking each package's readiness for the particular use case you need: **efficient** (in both the TimeDOMAIN = fast & the SpaceDOMAIN = low RAM, or an efficiently cluster-distributed sparse representation of data) for **large** (the costs of scaling go strongly against you ~ orders of magnitude slower for out-of-cache, the more so for distributed) – user3666197 Jan 08 '20 at 21:10
  • @Gab (you may already be aware that comments that do not start with an @-nickname do not get notified to other StackOverflow users, so making a comment for OneLyner will need to mention @-OneLyner, otherwise she/he will not get a notification about your comment) – user3666197 Jan 08 '20 at 21:24
  • @user3666197 Thanks for all the info! So for the first question "Is it possible to compute all the eigenvalues of a large sparse matrix using multiple CPUs ?", the answer would be yes ? I've edited the question to try to make it simpler and clearer. – Gab Jan 13 '20 at 19:01
  • @OneLyner It seems like Spectralib doesn't support complex type matrices. – Gab Feb 10 '20 at 19:30
  • @Gab oops, sorry, I missed that point :) – One Lyner Feb 19 '20 at 15:48
  • @Gab just tried scipy.sparse.linalg.eigs (based on https://www.caam.rice.edu/software/ARPACK/ it seems) on my machine and it is using multiple CPUs, but not all of them all the time. Part of the computation uses only one; part of it goes up to 8. – One Lyner Feb 19 '20 at 16:32
  • @OneLyner Did you try to get all the eigenvalues with it? It can only get up to N-2 eigenvalues and behaves very badly (way worse than dense methods) when trying to get that many. (By default it computes only 6 eigenvalues.) – Gab Feb 20 '20 at 16:56
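To illustrate the point raised in the last two comments: scipy.sparse.linalg.eigs wraps ARPACK, an iterative solver meant for a *few* eigenvalues (k defaults to 6 and must be smaller than N - 1), so it is not a route to the full spectrum. A minimal sketch, using a toy diagonal matrix as an illustrative assumption:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigs

# ARPACK via scipy: computes only k eigenvalues, with k < N - 1,
# so the full spectrum of an N x N matrix is out of reach this way.
N = 100
A = sp.diags(np.arange(1, N + 1).astype(complex), format="csr")  # toy matrix
w, v = eigs(A, k=6, which="LM")   # the 6 largest-magnitude eigenvalues
assert w.shape == (6,)            # only k of the N eigenvalues
```

Pushing k toward its maximum of N - 2 tends to degrade ARPACK badly, as noted above, which is why iterative sparse solvers rarely help when the *entire* spectrum is required.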

0 Answers