7

I have n real-number variables (whose exact values I don't know and don't really care about), let's call them X[n]. I also have m >> n relationships between them, let's call them R[m], of the form:

X[i] = alpha*X[j], where alpha is a positive real number, i and j are distinct, but the (i, j) pair is not necessarily unique (i.e. there can be two relationships between the same pair of variables with different alpha factors).

What I'm trying to do is find a set of alpha parameters that solves the overdetermined system in some least-squares sense. The ideal solution would be to minimize the sum of squared differences between each equation's alpha parameter and its chosen value, but I'm satisfied with the following approximation:

If I turn the m equations into an overdetermined system of n unknowns, any pseudo-inverse based numeric solver will give me the obvious solution (all zeroes). So what I currently do is add another equation into the mix, x[0] = 1 (actually any constant will do), and solve the resulting system in the least-squares sense using the Moore-Penrose pseudo-inverse. While this minimizes the sum of (x[0] - 1)^2 and the squares of the x[i] - alpha*x[j] residuals rather than my ideal objective, I find it a good and numerically stable approximation to my problem. Here is an example:

a = 1
a = 2*b
b = 3*c
a = 5*c

in Octave:

A = [
  1  0  0;
  1 -2  0;
  0  1 -3;
  1  0 -5;
]

B = [1; 0; 0; 0]

C = pinv(A) * B

or better yet:

C = pinv(A)(:,1)

This yields the values for a, b, c: [0.99383; 0.51235; 0.19136], which gives me the following (reasonable) relationships:

a = 1.9398*b
b = 2.6774*c
a = 5.1935*c
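
(These fitted ratios are just the quotients of the computed values, e.g. in Octave:)

C(1)/C(2)   % 1.9398
C(2)/C(3)   % 2.6774
C(1)/C(3)   % 5.1935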

So right now I need to implement this in C / C++ / Java, and I have the following questions:

Is there a faster method to solve my problem, or am I on the right track with generating the overdetermined system and computing the pseudo-inverse?

My current solution requires a singular value decomposition and three matrix multiplications, which is a little much considering that m can be 5000 or even 10000. Are there faster ways to compute the pseudo-inverse (actually, I only need its first column, not the entire matrix, since B is zero except for the first row), given the sparsity of the matrix? Each row contains exactly two non-zero values: one is always 1 and the other is always negative (see the sketch after these questions for the structure).

What math libraries would you suggest using for this? Is LAPACK OK?

I'm also open to any other suggestions, provided that they are numerically stable and asymptotically fast (let's say k*n^2, where k can be large).
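
To make the matrix structure from the questions above concrete, here is a minimal Octave sketch of how such a sparse system could be assembled. The arrays vi, vj and al are just illustrative names, one entry per relationship X[vi(k)] = al(k)*X[vj(k)], shown here with the small example from above:

n = 3;                                              % number of variables
vi = [1; 2; 1];  vj = [2; 3; 3];  al = [2; 3; 5];   % a = 2*b, b = 3*c, a = 5*c
m = numel(vi);
rows = [1; (2:m+1)'; (2:m+1)'];
cols = [1; vi; vj];
vals = [1; ones(m,1); -al];
A = sparse(rows, cols, vals, m+1, n);               % row 1 encodes the extra equation x(1) = 1
B = [1; zeros(m,1)];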

Radu Dan

2 Answers

4

Your problem is ill-posed. If you treat the problem as a function of n variables (the least-squares sum of the differences), the function has exactly ONE global minimum hyperplane.
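
For concreteness, written out from the relationships in the question, that function is

f(x) = sum over all (i, j, alpha) in R of (x[i] - alpha*x[j])^2

which is non-negative and has f(0) = 0.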

That global minimum will always contain zero unless you fix one of the variables to be nonzero, or reduce the function domain in some other way.

If what you want is a parameterization of the solution hyperplane, you can get that from the Moore-Penrose pseudo-inverse (http://en.wikipedia.org/wiki/Moore%E2%80%93Penrose_pseudoinverse); check the section on obtaining all solutions.
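
For reference, the parameterization from that section is: every least-squares solution of A*x = B can be written as

x = pinv(A)*B + (eye(n) - pinv(A)*A)*w

for an arbitrary vector w (Octave notation); the second term ranges over the null space of A, i.e. the directions along which the residual does not change.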

(Please note I've used the word "hyperplane" in a technically incorrect manner. I mean a "flat" unbounded subset of your parameter space: a line, a plane, something that can be parameterized by one or more vectors. For some reason I can't find the general noun for such objects.)

DanielOfTaebl
  • You are right. I believe an additional condition that I forgot to mention is that all x[] values are strictly positive real numbers. – Radu Dan Aug 19 '11 at 09:44
  • Doesn't change anything. Either your solution hyperplane will intersect with the set of all-positive reals, in which case you're happy, or it won't, in which case the solution you are looking for is the limit as you approach zero. – DanielOfTaebl Aug 19 '11 at 09:57
  • If I understand you correctly (which at this point I'm not entirely sure that I do), you're saying that there are an infinite number of hyperplanes that satisfy the overdetermined system, and their relative error approaches zero as the x[] parameters approach zero. If that is the case, I agree with you completely, but you probably missed the fact that the x[] parameters don't really matter, only their relative ratios (and thus, that they are not zero). It is the squared sum of their ratios minus the input ratios that I'm trying to minimize. – Radu Dan Aug 19 '11 at 10:18
  • Okay, I think I understand what you're getting at, but here is your central problem: if you multiply all your variables by "t", your least-square difference will increase by t squared. In your current formulation, this will ultimately dominate anything you try. Do you want to reformulate the problem as a function on the hypersphere, i.e. impose the condition that x[1]^2 + x[2]^2 + x[3]^2 + ... = 1? – DanielOfTaebl Aug 19 '11 at 11:34
  • Are you certain that the two least-square differences are uncorrelated? From testing with random constants (a = 1, 2, 1000), the least-square pseudo-inverse solution as provided by matlab remained the same, only scaled by a different factor. As for a hypersphere function, I don't know if that's in my best interest (this is where my math expertise stops); I am trying to determine a good (hopefully optimal in some error-minimizing sense) set of parameters from my given (possibly inexact or contradictory) inputs. – Radu Dan Aug 19 '11 at 11:47
  • At this point it's probably easiest to ask for some context. If you triple the constant, you get nine times the least-squares difference. But if you change which variable the constant is attached to, the ratio relation will not hold. Well, it might hold; I haven't figured out if Moore-Penrose is correct for the spherical case. – DanielOfTaebl Aug 19 '11 at 12:06
  • I have 1000 (n) video cards. I also have some 10000 (m) relationships between them extracted from various benchmarks (GeForce 560TI is 25% faster than a regular GeForce 560). Some relationships are contradictory (GeForce 560 is 10% faster than GeForce 560TI), while others are just slightly different (560TI = 120% of 560). The final purpose is to sort all the video cards with aggregated benchmark data, and this is why I need the relative performance ratios between them. The least squares applies to the quotients, not the values themselves (relative values have no meaning by themselves anyway). – Radu Dan Aug 19 '11 at 12:15
  • As for the triple constant, nine times least square difference, I can't say that it's of great concern to me as long as the solution with the least square difference remains the same (minus scaling of course). – Radu Dan Aug 19 '11 at 12:23
  • Okay, well I should probably stop working on this and do my work. But I'll post something more in the evening. – DanielOfTaebl Aug 19 '11 at 12:41
3

The SVD approach is numerically very stable but not very fast. If you use SVD, then LAPACK is a good library to use. If it's just a one-off computation, then it's probably fast enough.

If you need a substantially faster algorithm, you might have to sacrifice stability. One possibility would be to use the QR factorization. You'll have to read up on this to see the details, but part of the reasoning goes as follows. If AP = QR (where P is a permutation matrix, Q is an orthogonal matrix, and R is a triangular matrix) is the economy QR-decomposition of A, then the equation AX = B becomes Q R P^{-1} X = B and the solution is X = P R^{-1} Q^T B. The following Octave code illustrates this using the same A and B as in your code.

[Q,R,P] = qr(A,0)
C(P) = R \ (Q' * B)

The nice thing about this is that you can exploit the sparsity of A by doing a sparse QR decomposition. There is some explanation in the Octave help for the qr function, but it did not work for me immediately.
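
For what it's worth, the form the Octave documentation describes is roughly the following (a sketch only, so the same caveat applies):

As = sparse(A);
[c, r] = qr(As, B);   % "Q-less" sparse QR: c is Q'*B
C = r \ c;            % least-squares solution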

Even faster (but also even less stable) is to use the normal equations: If A X = B then A^T A X = A^T B. The matrix A^T A is a square matrix of (hopefully) full rank, so you can use any solver for linear equations. Octave code:

C = (A' * A) \ (A' * B)

Again, sparsity can be exploited in this approach. There are many methods and libraries for solving sparse linear systems; a popular one seems to be UMFPACK.
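
A minimal sketch of the sparse variant in Octave (the same idea carries over to UMFPACK or any other sparse direct solver from C/C++):

As = sparse(A);          % exploit the two-nonzeros-per-row structure
M  = As' * As;           % n-by-n, sparse, symmetric
C  = M \ (As' * B);      % sparse direct solve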

Added later: I don't know enough about this field to quantify this precisely; whole books have been written on the subject. Perhaps QR is about a factor of 3 or 5 faster than SVD, and the normal equations twice as fast again. The effect on numerical stability depends on your matrix A. Sparse algorithms can be much faster (say by a factor of m), but their computational cost and numerical stability depend very much on the problem, in ways that are sometimes not well understood.

In your use case, my recommendation would be to try computing the solution with the SVD, see how long it takes, and if that is acceptable then just use that (I guess it would be about a minute for n=1000 and m=10000). If you want to study it further, try also QR and normal equations and see how much faster they are and how accurate; if they give approximately the same solution as SVD then you can be pretty confident they are accurate enough for your purposes. Only if these are all too slow and you are willing to sink some time into it, look at sparse algorithms.

Jitse Niesen
  • In the end I went with the normal equations for ease of implementation. Accuracy is acceptable, and with a CUDA implementation I'm getting excellent speed (a few seconds on my GF 560 Ti). Thanks for all the info! – Radu Dan Aug 28 '11 at 16:13