
As part of a synthetic-noise-generation algorithm, I have to construct, on the fly, many large non-singular square matrices

a_{i,j} (i, j : 1..n)  such that  ∀ (i, j): a_{i,j} ∈ ℤ, 0 ≤ a_{i,j} ≤ k, and Det[a] ≠ 0

but the a_{i,j} should also be random, following a uniform distribution on [0, k].

In its current incarnation the problem has n ≅ 300, k ≅ 100.

In Mathematica, I can generate random-element matrices very quickly, but I must also check them for singularity. I am currently using the determinant for this.

The problem is that this check takes nearly 2 seconds for a 300x300 matrix, and I can't afford that.

Of course, I could construct the rows by selecting a random first row and then building successive orthogonal rows, but I'm not sure how to guarantee that the elements of those rows would still follow a uniform distribution on [0, k].

I am looking for a solution in Mathematica, but just a faster algorithm for generating the matrices is also welcome.

NB: The U[0, k] condition means that, taking a set of such matrices, each position (i, j) across the set should follow a uniform distribution.
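
The current approach can be sketched as follows (a minimal sketch; `randomNonSingular` is just an illustrative name, and the bottleneck is the Det test inside the rejection loop):

```mathematica
n = 300; k = 100;
(* Draw uniform candidates and reject until one is non-singular.
   Almost every draw passes; virtually all the cost is in the test. *)
randomNonSingular[n_, k_] := Module[{m},
  While[Det[m = RandomInteger[k, {n, n}]] == 0];
  m]
```

Rejection sampling keeps the surviving matrices uniform on the non-singular subset, since every candidate is equally likely and only singular ones are discarded.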

Dr. belisarius
  • Maybe you should post this question in a Gap or Sage forum. You basically want random matrices in GL(n,ZZ). For example in Sage you could use `random_matrix(ZZ, n, n, algorithm='subspaces', rank=n)`, but for large `n` this is really slow... I'm sure the experts would be able to show you a better way. – Simon Feb 27 '11 at 08:30
  • @Simon I have a few alternatives. I am trying to avoid the problem by using random but Hermitian operators, but I have to prove some equivalence relations that are not easy. Also, I may go with your answer and see if the overall timing is good enough. And the GF(p) suggestion is also a possible way. That is all before going to Sage, which is one of my many areas of ignorance. – Dr. belisarius Feb 27 '11 at 08:42
  • You might find this article by Dana Randall useful: http://www.eecs.berkeley.edu/Pubs/TechRpts/1991/CSD-91-658.pdf – Sasha Apr 23 '11 at 06:05
  • @Sasha Thanks! It seems much more efficient than the method I finally used. I'll keep the pointer in case I need to redo the filtering stage. – Dr. belisarius Apr 23 '11 at 06:22
  • For the record, Randall's paper is for nonsingular matrices over finite fields. A matrix of zeros and ones that has determinant 2 is singular over F_2, but nonsingular over the reals. – Rick Sep 15 '11 at 17:16

4 Answers


In both Matlab and Octave, the determinant and LU factorization of a 500x500 matrix are basically instantaneous. Does Mathematica have a mode where it can call out to LAPACK or some similar library? You might need to annotate that your arrays should be treated as floating point numbers and not symbolically; that might make it a lot faster. As a point of comparison, LU on a 5000x5000 matrix takes 8.66 seconds on my system using Octave; 500x500 should be around 1000 times faster than that.
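
A sketch of the corresponding check in Mathematica (timings are machine-dependent): applying N to the matrix makes Det dispatch to machine-precision LAPACK routines instead of exact integer arithmetic.

```mathematica
n = 300;
mat = RandomInteger[100, {n, n}];
AbsoluteTiming[Det[mat]]     (* exact integer arithmetic: slow *)
AbsoluteTiming[Det[N[mat]]]  (* machine precision via LAPACK: fast, but subject to rounding *)
```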

Jeremiah Willcock
  • 30,161
  • 7
  • 76
  • 78
  • @belisarius: If the matrices contain real numbers, there is probably no need to check for singularity (or near-singularity). The probability of generating a singular matrix is essentially 0 (it is exactly 0 over "actual" real numbers). The paper at http://mss3.libraries.rutgers.edu/dlr/TMP/rutgers-lib_25959-PDF-1.pdf (use the Google cache) states that the probability that a 500x500 matrix whose elements are all either -1 or 1 is singular is very, very tiny. – Jeremiah Willcock Feb 27 '11 at 07:39
  • @Jeremiah Thanks for your answer. Please note that as the matrix elements are Integers, the probability of singularity is certainly not null. Also, the dimensions are just the current incarnation, I'll try to use smaller numbers in the future. I guess I could call LAPACK, but never did. I'll try to research about it. – Dr. belisarius Feb 27 '11 at 07:45
  • @belisarius: It seems like Mathematica will do the call automatically, as long as you make sure the elements of your matrix are floating point and not integers (that will likely trigger the calculation being done in exact rational arithmetic). – Jeremiah Willcock Feb 27 '11 at 07:46
  • @Jeremiah The problem is that I must be 100% sure that the matrix is non-singular, that is the reason I am using integer values. A rounding error will mean disaster :) – Dr. belisarius Feb 27 '11 at 07:49
  • @belisarius: What about http://techreports.lib.berkeley.edu/accessPages/CSD-91-658.html? They claim to be able to do that efficiently; note that you can just pick your finite field as Z_{the next prime above k}, and I believe that anything that is nonsingular in the field will also be nonsingular in the rationals. – Jeremiah Willcock Feb 27 '11 at 07:55
  • @JeremiahWillcock: Working in a finite field like that would discount matrices that aren't actually singular over the rationals. Would this affect the distribution much? – Simon Feb 27 '11 at 08:00
  • @Simon: Yes, that is true. I don't know how that would affect the distribution. I think we are concerned about corner cases anyway; I didn't really look through the rutgers.edu link I posted above, but it seems to suggest that the chances of getting a singular matrix are tiny, even when the set of values possible for elements is small. – Jeremiah Willcock Feb 27 '11 at 08:01
  • @Jeremiah The paper is really great. I'll re-think the problem to see if GF(p) is an option. I have some freedom to choose the max value, but never thought much about that since I believed the problem does not change by doing that. Now it does. – Dr. belisarius Feb 27 '11 at 08:08
  • @Simon You are right about the distribution reshaping. But perhaps I could define a new universe for the whole problem using GF(p). Have to think about it- Food for thought. – Dr. belisarius Feb 27 '11 at 08:14
  • If you just pick GF(2), anything non-singular in GF(2) is non-singular over the rationals. Simply put, you are effectively examining det(A) mod 2. If det(A) mod 2 = 1, then det(A) is an odd integer and thus A is non-singular. If det(A) mod 2 = 0, then det(A) is an even integer, but 0 is the only even integer we care about: det(A) = 2 or 4 or ... are perfectly fine, yet we'd throw them away by testing mod 2. This is true mod N for any N, but the larger N is, the fewer false positives you'll discard. Perhaps checking mod 2 is fast enough and good enough? – President James K. Polk Feb 27 '11 at 18:04
  • @GregS The problem with that is being sure I am still getting a uniform distribution of the resulting elements over [0,k]. – Dr. belisarius Feb 27 '11 at 18:25
  • @belisarius: The distribution would not be exactly the same, since there are some matrices with determinant `k*p` (for some non-zero k, in the rationals) that have determinant `0` in the field and so are rejected. The question is really whether the distribution would be close enough for what you need. There are papers that say that singular matrices are very rare, even in fields, but those results are asymptotic, I believe, and so don't specify all of the constants. – Jeremiah Willcock Feb 27 '11 at 21:49
  • @belisarius: See http://www.inf.ethz.ch/personal/emo/PublFiles/RankSparseRandMatr_RSA10_97.pdf -- that work claims that the probability of getting a singular matrix over a finite field when generating using a particular variant of a uniform distribution is small, so even if you assume all singular matrices over the field are false positives, that will not change your distribution much. Also see the citations and footnotes in the intro of that paper for some more relevant facts. – Jeremiah Willcock Feb 27 '11 at 21:55
  • @Jeremiah That seems what I need to justify the GF(p) trick. Thanks a lot! – Dr. belisarius Feb 27 '11 at 22:13
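
The mod-p idea from this thread can be written directly with Det's Modulus option (a sketch; p = 101, the smallest prime above k = 100, is an assumed choice, and `nonSingularModPQ` is a made-up name):

```mathematica
(* A nonzero determinant mod p implies a nonzero determinant over the
   integers, so this test never falsely certifies a singular matrix;
   it can only falsely reject matrices whose determinant happens to be
   a multiple of p. *)
p = 101;
nonSingularModPQ[m_] := Det[m, Modulus -> p] != 0
```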

You could use MatrixRank instead. On my machine it's about n/10 times faster than Det for large n×n integer matrices.
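
For example (a sketch of the intended usage; MatrixRank on an integer matrix is computed exactly):

```mathematica
n = 300; k = 100;
mat = RandomInteger[k, {n, n}];
MatrixRank[mat] == n  (* True exactly when mat is non-singular *)
```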

Simon
  • n/10 on my machine too. That pushes my timing almost to the bearable frontier. Thanks! – Dr. belisarius Feb 27 '11 at 08:25
  • I don't get such a dramatic speed-up on my machine - `MatrixRank` there is about 10 times faster than `Det` for `500x500` matrices. If we use `MatrixRank[N@yourMatrix]`, this seems to still be reliable and speeds things up about twice. One may also consider using CUDA - at least for some matrix operations (matrix multiplication, for example), the speed-ups may be quite dramatic for a powerful card such as a CUDA Tesla C1060 or better. Additionally, if several cores are available, one can use `ParallelMap[MatrixRank, {m1,m2,...,mn}]`, where `n` is the number of cores, to trade memory for speed. – Leonid Shifrin Feb 27 '11 at 13:07
  • @Leonid Don't be shy, and post your insightful comments as answers, so future readers don't miss them. Everything inside the comment threads is supposed to be collateral. Regarding your N@ suggestion, the problem is that throwing a singular matrix into the set will spoil the set as a whole (think of each set as a basis for some subspace), and then the inter-set ranking, as from this point on I will only use numerical algorithms that are unable to detect ill-conditioned elements. I'm prototyping now, and the parallel-processing tricks will be used in the real scaled-up problem, if this works. – Dr. belisarius Feb 27 '11 at 15:34
  • @belisarius Ok, I got it. Initially it was just an extension of @Simon's answer (the `N` suggestion), and then I did not switch modes. The parallel part does work - it gives me `0.07` seconds for 6 matrices on my 6-core machine, which is exactly 6 times faster than a non-parallel version. It is pretty mindless to do, if you have a multicore machine, so you could as well try it while prototyping. Given your circumstances, the CUDA suggestion probably won't work for you without reimplementing `Det` or `MatrixRank` for integer-valued matrices on CUDA. – Leonid Shifrin Feb 27 '11 at 16:26

If you use numeric approximate matrices in the singularity tests you will get much better speed.

k = 100; n = 500;
mat = RandomInteger[k, {n, n}];

AbsoluteTiming[Det[mat] == 0]

Out[57]= {6.8640000, False}

AbsoluteTiming[Det[N@mat] == 0.] (*warning light!!*)

Out[58]= {0.0230000, False}

AbsoluteTiming[MatrixRank[N@mat] != n]

Out[59]= {0.1710000, False}

Unfortunately, that fastest test is not reliable. But the rank test should work well. Here is a quick example wherein we replace the last row by the sum of the prior rows.

mat2 = mat;
mat2[[-1]] = Total[Most[mat]];

AbsoluteTiming[Det[mat2] == 0]

Out[70]= {9.4750000, True}

AbsoluteTiming[Det[N@mat2] == 0.]

Out[69]= {0.0470000, False}

AbsoluteTiming[MatrixRank[N@mat2] != n]

Out[71]= {0.1440000, True}

In principle, I suppose there is a smallish chance that the rank test could give a false negative, say due to ill conditioning. Since your usage will better tolerate false positives (that is, incorrect claims of singularity), you could instead test for singularity over a prime modulus. I think this was one of the recommendations others have made.

Continuing the above examples:

AbsoluteTiming[Det[mat, Modulus -> Prime[1000]]]

Out[77]= {0.6320000, 4054}

AbsoluteTiming[Det[mat2, Modulus -> Prime[1000]]]

Out[78]= {0.6470000, 0}

This is slow but faster than working over the rationals. For what it's worth, for most usage I'd be fairly confident in the outcomes of the faster testing via MatrixRank[N[matrix]].
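
A possible way to combine these tests (a sketch, not code from the original answer; `randomNonSingular` is an illustrative name): accept a candidate as soon as its determinant mod p is nonzero, and fall back to an exact MatrixRank check only on the rare mod-p failures, so no nonsingular matrix is ever discarded.

```mathematica
randomNonSingular[n_, k_, p_] := Module[{m},
  While[
   m = RandomInteger[k, {n, n}];
   (* Keep looping only if m is genuinely singular: the cheap mod-p
      test short-circuits, and the exact rank test settles the rest. *)
   Det[m, Modulus -> p] == 0 && MatrixRank[m] < n];
  m]
(* e.g. randomNonSingular[300, 100, Prime[1000]] *)
```

Because candidates are drawn uniformly and only truly singular ones are rejected, the elementwise uniform distribution is preserved exactly.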

Daniel Lichtblau Wolfram Research

  • +1 Thanks Daniel. As for the "As your usage will better tolerate false positives", I have still to prove that this will not violate the Uniform Distribution for the filtered matrices. Seems true, but ... – Dr. belisarius Feb 27 '11 at 18:17
  • You could apply Last[SingularValueList[N[M]]] to examine the smallest singular value. You could then safely assume that M is non-singular if that value is sufficiently large, perhaps twice machine epsilon. If not, you could apply the more expensive test. – Mark McClure Feb 27 '11 at 18:25
  • @belisarius It won't violate uniformity of distribution, subject to the quality of the rng (which is pretty good). The idea is that your nonsingular matrices are equally likely selections in this distribution, because all matrices are equally likely and you only discard if singular. Similar to the method of picking unif random elements in a ball, by picking in a cube and discarding the selections not in the ball. – Daniel Lichtblau Feb 27 '11 at 21:35
  • @belisarius Sorry, I misunderstood. Your concern is in discarding nonsingular matrices that happen to be singular mod p. I gotta say, this is a VERY low probability event, sufficiently so that I doubt you can devise a random test to detect any discrepancy. Nonetheless, you can further validate singularity with a couple more such tests modulo different primes. Now the failure probability is 1/(wonthappeninyourlifetime)^3. And this is cheap, because you won't be getting failures in one such test, hence no need to test further. – Daniel Lichtblau Feb 27 '11 at 21:41
  • @Daniel Your last comment got the point. I think a formal demonstration is not so easy, but I may make some experiments to "prove" the uniform distribution. – Dr. belisarius Feb 27 '11 at 21:51
  • @belisarius If you require the test be faithful (pass only when nonsingular, fail only when singular) then you can do one of the quick tests modp, only testing over Q when mod p says it is singular. This is along the lines proposed by Mark McClure. If k, n are at all sizeable you should not see failures mod p, hence no slowdown from needing the more expensive test. But you will also know your code is bulletproof. – Daniel Lichtblau Feb 27 '11 at 22:11
  • @Daniel yes, I am going to go into those lines. Thanks a lot. – Dr. belisarius Feb 27 '11 at 22:16

Here's an expansion of a comment that I made above. I agree with Dan that it is highly improbable that the numerical version would return a false positive. Nonetheless, you can avoid that scenario by examining the singular values and returning False reliably if the smallest singular value is larger than some error tolerance. (Admittedly, it might be a bit tricky to find a provable tolerance.) If the smallest singular value is uncomfortably small, you can apply Det to the integer matrix.

Here's a function that should quickly return False for most non-singular matrices. If the matrix is close to singular, a more expensive integer computation is performed.

singularQ[M_?MatrixQ] := If[
  Last[SingularValueList[N[M], Tolerance -> 0]] > 1/10.0^8,
  False,
  Det[M] == 0];

Here are 200 matrices that meet your description. One in the middle has been rigged to be singular.

SeedRandom[1];
matrices = RandomInteger[{0, 100}, {200, 300, 300}];
matrices[[100, -1]] = Sum[RandomInteger[{0, 10}]*matrices[[100, k]],
 {k, 1, Length[matrices[[100]]] - 1}];

Now, let's find the indices of all the singular matrices, watching as we go.

Flatten@Monitor[Table[
  If[singularQ[matrices[[k]]], k, {}],
  {k, 1, Length[matrices]}], k]

Mark McClure
  • Checking the rank of an approximate numeric matrix is essentially doing this sort of thing: finding the singular values and counting the ones larger than some Tolerance times the largest, with Tolerance an option to the function. Or so I believe. – Daniel Lichtblau Feb 27 '11 at 22:15