LU Decomposition from Numerical Recipes not working; what am I doing wrong?

Question

I've literally copied and pasted from the supplied source code for Numerical Recipes for C for in-place LU Matrix Decomposition, problem is its not working.

I'm sure I'm doing something stupid but would appreciate anyone being able to point me in the right direction on this; I've been working on its all day and can't see what I'm doing wrong.

POST-ANSWER UPDATE: The project is finished and working. Thanks to everyone for their guidance.

#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#define MAT1 3
#define TINY 1e-20

int h_NR_LU_decomp(float *a, int *indx){
  //Taken from Numerical Recipies for C
  int i,imax,j,k;
  float big,dum,sum,temp;
  int n=MAT1;

  float vv[MAT1];
  int d=1.0;
  //Loop over rows to get implicit scaling info
  for (i=0;i<n;i++) {
    big=0.0;
    for (j=0;j<n;j++)
      if ((temp=fabs(a[i*MAT1+j])) > big) 
        big=temp;
    if (big == 0.0) return -1; //Singular Matrix
    vv[i]=1.0/big;
  }
  //Outer kij loop
  for (j=0;j<n;j++) {
    for (i=0;i<j;i++) {
      sum=a[i*MAT1+j];
      for (k=0;k<i;k++) 
        sum -= a[i*MAT1+k]*a[k*MAT1+j];
      a[i*MAT1+j]=sum;
    }
    big=0.0;
    //search for largest pivot
    for (i=j;i<n;i++) {
      sum=a[i*MAT1+j];
      for (k=0;k<j;k++) sum -= a[i*MAT1+k]*a[k*MAT1+j];
      a[i*MAT1+j]=sum;
      if ((dum=vv[i]*fabs(sum)) >= big) {
        big=dum;
        imax=i;
      }
    }
    //Do we need to swap any rows?
    if (j != imax) {
      for (k=0;k<n;k++) {
        dum=a[imax*MAT1+k];
        a[imax*MAT1+k]=a[j*MAT1+k];
        a[j*MAT1+k]=dum;
      }
      d = -d;
      vv[imax]=vv[j];
    }
    indx[j]=imax;
    if (a[j*MAT1+j] == 0.0) a[j*MAT1+j]=TINY;
    for (k=j+1;k<n;k++) {
      dum=1.0/(a[j*MAT1+j]);
      for (i=j+1;i<n;i++) a[i*MAT1+j] *= dum;
    }
  }
  return 0;
}

void main(){
    //3x3 Matrix
    float exampleA[]={1,3,-2,3,5,6,2,4,3};
    //pivot array (not used currently)
    int* h_pivot = (int *)malloc(sizeof(int)*MAT1);
    int retval = h_NR_LU_decomp(&exampleA[0],h_pivot);
    for (unsigned int i=0; i<3; i++){
      printf("\n%d:",h_pivot[i]);
      for (unsigned int j=0;j<3; j++){
        printf("%.1lf,",exampleA[i*3+j]);
      }
    }
}

WolframAlpha says the answer should be

1,3,-2
2,-2,7
3,2,-2

I'm getting:

2,4,3
0.2,2,-2.8
0.8,1,6.5

And so far I have found at least 3 different versions of the 'same' algorithm, so I'm completely confused.

PS yes I know there are at least a dozen different libraries to do this, but I'm more interested in understanding what I'm doing wrong than the right answer.

PPS since in LU Decomposition the lower resultant matrix is unity, and using Crouts algorithm as (i think) implemented, array index access is still safe, both L and U can be superimposed on each other in-place; hence the single resultant matrix for this.

Ha, this is a pulled out main just for posting; you really didnt want the other thousand or so lines in that file, so pardon the non-rigor :D — Bolster, Apr 16 '11 at 22:41
According to my interpretation of what a "LU decomposition" is, the result is 2 matrices: a Lower triangular one and an Upper triangular another. Even the WolframAlpha page you link to reflects my interpretation. Where did you get those answers? — pmg, Apr 16 '11 at 22:53
Literally from the pages of the 'hallowed' Numerical Recipes, along with the associated source files and at least three versions online that look to do similar/the same thing. Google for 'LU Decomposition C' and there's plenty, but I can't get any of them working so am posting my attempt at the NR one. — Bolster, Apr 16 '11 at 23:00
Any chance that you're feeding column major data to a row major implementation or vice versa? Easy check would be transposing the matrix. — dmckee --- ex-moderator kitten, Apr 17 '11 at 00:48

score 10 · Accepted Answer · answered Apr 17 '11 at 01:43

10

I think there's something inherently wrong with your indices. They sometimes have unusual start and end values, and the outer loop over j instead of i makes me suspicious.

Before you ask anyone to examine your code, here are a few suggestions:

double-check your indices
get rid of those obfuscation attempts using sum
use a macro a(i,j) instead of a[i*MAT1+j]
write sub-functions instead of comments
remove unnecessary parts, isolating the erroneous code

Here's a version that follows these suggestions:

#define MAT1 3
#define a(i,j) a[(i)*MAT1+(j)]

int h_NR_LU_decomp(float *a, int *indx)
{
        int i, j, k;
        int n = MAT1;

        for (i = 0; i < n; i++) {
                // compute R
                for (j = i; j < n; j++)
                        for (k = 0; k < i-2; k++)
                                a(i,j) -= a(i,k) * a(k,j);

                // compute L
                for (j = i+1; j < n; j++)
                        for (k = 0; k < i-2; k++)
                                a(j,i) -= a(j,k) * a(k,i);
        }

        return 0;
}

Its main advantages are:

it's readable
it works

It lacks pivoting, though. Add sub-functions as needed.

My advice: don't copy someone else's code without understanding it.

Most programmers are bad programmers.

answered Apr 17 '11 at 01:43

Philip

5,795
3
33
68

+1 for advice, another +1 if I could for "Most programmers are bad programmers" – mokus Apr 17 '11 at 13:08
Thankyou @phillip, but I think that your loop conditions are incorrect; (I'm now reading from a 1980 mathematics textbook for my examples) I think your k – Bolster Apr 17 '11 at 15:04
Actually, after actually taking some time to experiment, I don't think this is correct. – Bolster Apr 17 '11 at 20:01
It is lacking a division operation in the computing of L, and I'm not sure about the indices. Actually, I only tested this with your example matrix... My "Numerical Analysis" book was (and still is) in the other room. If I could only find my implementation from about 5 years ago... I know that I had to implement LU decomposition for one of my math classes at college. – Philip Apr 17 '11 at 21:40
Marking as answered as it initially sent me in the right direction down the rabbit hole – Bolster Apr 18 '11 at 11:28
another +1 for "Most (many ?) programmers are bad programmers" – denis Apr 25 '11 at 11:30

score 10 · Answer 2 · answered Apr 17 '11 at 03:03

For the love of all that is holy, don't use Numerical Recipies code for anything except as a toy implementation for teaching purposes of the algorithms described in the text -- and, really, the text isn't that great. And, as you're learning, neither is the code.

Certainly don't put any Numerical Recipies routine in your own code -- the license is insanely restrictive, particularly given the code quality. You won't be able to distribute your own code if you have NR stuff in there.

See if your system already has a LAPACK library installed. It's the standard interface to linear algebra routines in computational science and engineering, and while it's not perfect, you'll be able to find lapack libraries for any machine you ever move your code to, and you can just compile, link, and run. If it's not already installed on your system, your package manager (rpm, apt-get, fink, port, whatever) probably knows about lapack and can install it for you. If not, as long as you have a Fortran compiler on your system, you can download and compile it from here, and the standard C bindings can be found just below on the same page.

The reason it's so handy to have a standard API to linear algebra routines is that they are so common, but their performance is so system-dependant. So for instance, Goto BLAS is an insanely fast implementation for x86 systems of the low-level operations which are needed for linear algebra; once you have LAPACK working, you can install that library to make everything as fast as possible.

Once you have any sort of LAPACK installed, the routine for doing an LU factorization of a general matrix is SGETRF for floats, or DGETRF for doubles. There are other, faster routines if you know something about the structure of the matrix - that it's symmetric positive definite, say (SBPTRF), or that it's tridiagonal (STDTRF). It's a big library, but once you learn your way around it you'll have a very powerful piece of gear in your numerical toolbox.

Normally I'd agree, but I'm on a masochistic mission to 'See for myself', unfortunately my linear algebra isn't exactly up to scratch for this kind of thing! Thanks — Bolster, Apr 17 '11 at 19:21

score 3 · Answer 3 · answered Apr 17 '11 at 10:31

The thing that looks most suspicious to me is the part marked "search for largest pivot". This does not only search but it also changes the matrix A. I find it hard to believe that is correct.

The different version of the LU algorithm differ in pivoting, so make sure you understand that. You cannot compare the results of different algorithms. A better check is to see whether L times U equals your original matrix, or a permutation thereof if your algorithm does pivoting. That being said, your result is wrong because the determinant is wrong (pivoting does not change the determinant, except for the sign).

Apart from that @Philip has good advice. If you want to understand the code, start by understanding LU decomposition without pivoting.

talonmies · Answer 4 · 2011-04-18T10:01:51.643

To badly paraphrase Albert Einstein:

... a man with a watch always knows the exact time, but a man with two is never sure ....

Your code is definitely not producing the correct result, but even if it were, the result with pivoting will not directly correspond to the result without pivoting. In the context of a pivoting solution, what Alpha has really given you is probably the equivalent of this:

    1  0  0      1  0  0       1  3 -2
P=  0  1  0   L= 2  1  0  U =  0 -2  7
    0  0  1      3  2  1       0  0 -2

which will then satisfy the condition A = P.L.U (where . denotes the matrix product). If I compute the (notionally) same decomposition operation another way (using the LAPACK routine dgetrf via numpy in this case):

In [27]: A
Out[27]: 
array([[ 1,  3, -2],
       [ 3,  5,  6],
       [ 2,  4,  3]])

In [28]: import scipy.linalg as la

In [29]: LU,ipivot = la.lu_factor(A)

In [30]: print LU
[[ 3.          5.          6.        ]
 [ 0.33333333  1.33333333 -4.        ]
 [ 0.66666667  0.5         1.        ]]

In [31]: print ipivot
[1 1 2]

After a little bit of black magic with ipivot we get

    0  1  0       1        0    0       3  5       6
P = 0  0  1   L = 0.33333  1    0  U =  0  1.3333 -4
    1  0  0       0.66667  0.5  1       0  0       1

which also satisfies A = P.L.U . Both of these factorizations are correct, but they are different and they won't correspond to a correctly functioning version of the NR code.

So before you can go deciding whether you have the "right" answer, you really should spend a bit of time understanding the actual algorithm that the code you copied implements.

Very true, and over the past few day's I've realised I'm alot less knowledgeable in this area than I assumed. If I could assign 2 answered's I would. — Bolster, Apr 18 '11 at 11:28

score 0 · Answer 5 · edited Jun 12 '22 at 22:20

This thread has been viewed 6k times in the past 10 years. I had used NR Fortran and C for many years, and do not share the low opinions expressed here.

I explored the issue you encountered, and I believe the problem in your code is here:

for (k=j+1;k<n;k++) {
    dum=1.0/(a[j*MAT1+j]);
    for (i=j+1;i<n;i++) a[i*MAT1+j] *= dum;
}

while in the original if (j != n-1) { ... } is used. I think the two are not equivalent.

NR's lubksb() does have a small issue in the way they set up finding the first non-zero element, but this can be skipped at very low cost, even for a large matrix. With that, both ludcmp() and lubksb(), entered as published, work just fine, and as far as I can tell perform well.

Here's a complete test code, mostly preserving the notation of NR, wth minor simplifications (tested under Ubuntu Linux/gcc):

/* A sample program to demonstrate matrix inversion using the 
 * Crout's algorithm from Teukolsky and Press (Numerical Recipes):
 * LU decomposition + back-substitution, with partial pivoting
 * 2022.06 edward.sternin at brocku.ca
 */
#define N 7
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#define a(i,j) a[(i)*n+(j)]
/* implied 1D layout is a(0,0), a(0,1), ... a(0,n-1), a(1,0), a(1,1), ... */

void matrixPrint (double *M, int nrow, int ncol) {
  int i,j;
  for (i=0;i<nrow;i++) {
    for (j=0;j<ncol;j++) { fprintf(stderr," %+.3f\t",M[i*ncol+j]); }
    fprintf(stderr,"\n");
    }
  }

void die(char msg[]) {
  fprintf(stderr,"ERROR in %s, aborting\n",msg); 
  exit(1);
  }

void ludcmp(double *a, int n, int *indx) {
  int i, imax, j, k;
  double big, dum, sum, temp;
  double *vv;

/* i=row index, i=0..(n-1); j=col index, j=0..(n-1) */

  vv=(double *)malloc((size_t)(n * sizeof(double)));
  if (!vv) die("ludcmp: allocation failure");

  for (i = 0; i < n; i++) { /* loop over rows */
    big = 0.0; 
    for (j = 0; j < n; j++) {
      if ((temp=fabs(a(i,j))) > big) big=temp;
      }
    if (big == 0.0) die("ludcmp: a singular matrix provided");
    vv[i] = 1.0 / big;  /* vv stores the scaling factor for each row */
    }

  for (j = 0; j < n; j++) {  /* Crout's method: loop over columns */
    for (i = 0; i < j; i++) { /* except for i=j */
      sum = a(i,j);
      for (k = 0; k < i; k++) { sum -= a(i,k) * a(k,j); }
      a(i,j) = sum;  /* Eq. 2.3.12, in situ */
      }
    big = 0.0;  /* searching for the largest pivot element */
    for (i = j; i < n; i++) {
      sum = a(i,j);
      for (k = 0; k < j; k++) { sum -= a(i,k) * a(k,j); }
      a(i,j) = sum;
      if ((dum = vv[i] * fabs(sum)) >= big) {
        big = dum;
        imax = i;
        }
      }
    if (j != imax) { /* if needed, interchange rows */
      for (k = 0; k < n; k++){
        dum = a(imax,k);
        a(imax,k) = a(j,k);
        a(j,k) = dum;
        }
      vv[imax] = vv[j]; /* keep the scale factor with the new row location */
      }  
    indx[j] = imax;
    if (j != n-1) {     /* divide by the pivot element */
      dum = 1.0 / a(j,j);
      for (i = j + 1; i < n; i++) a(i,j) *= dum;
      }
  }
  free(vv);
}
  
void lubksb(double *a, int n, int *indx, double *b) {
  int i, ip, j;
  double sum;

  for (i = 0; i < n; i++) {
    /* Forward substitution, Eq.2.3.6, unscrambling permutations from indx[] */
    ip = indx[i];
    sum = b[ip];
    b[ip] = b[i];
    for (j = 0; j < i; j++) sum -= a(i,j) * b[j];
    b[i] = sum;
    }
  for (i = n-1; i >= 0; i--) { /*  backsubstitution, Eq. 2.3.7 */
    sum = b[i];
    for (j = i + 1; j < n; j++) sum -= a(i,j) * b[j];
    b[i] = sum / a(i,i);
    }
  }  

int main() {
  double *a,*y,*col,*aa,*res,sum;
  int i,j,k,*indx;

  a=(double *)malloc((size_t)(N*N * sizeof(double)));
  y=(double *)malloc((size_t)(N*N * sizeof(double)));
  col=(double *)malloc((size_t)(N * sizeof(double)));
  indx=(int *)malloc((size_t)(N * sizeof(int)));
  aa=(double *)malloc((size_t)(N*N * sizeof(double)));
  res=(double *)malloc((size_t)(N*N * sizeof(double)));
  if (!a || !y || !col || !indx || !aa || !res) die("main: memory allocation failure");

  srand48((long int) N);
  for (i=0;i<N;i++) {
    for (j=0;j<N;j++) { aa[i*N+j] = a[i*N+j] = drand48(); }
    } 
  fprintf(stderr,"\nRandomly generated matrix A = \n");
  matrixPrint(a,N,N);

  ludcmp(a,N,indx);

  for(j=0;j<N;j++) {
    for(i=0;i<N;i++) { col[i]=0.0; }
    col[j]=1.0;
    lubksb(a,N,indx,col);
    for(i=0;i<N;i++) { y[i*N+j]=col[i]; }
    }
  fprintf(stderr,"\nResult of LU/BackSub is inv(A) :\n");
  matrixPrint(y,N,N);

  for (i=0; i<N; i++) { 
    for (j=0;j<N;j++) {
      sum = 0;
      for (k=0; k<N; k++) { sum += y[i*N+k] * aa[k*N+j]; }
      res[i*N+j] = sum;
      }
    }
  fprintf(stderr,"\nResult of inv(A).A = (should be 1):\n");
  matrixPrint(res,N,N);

  return(0);
  }

LU Decomposition from Numerical Recipes not working; what am I doing wrong?

5 Answers5

Linked