
Consider the following weighted solution to the normal equation for a least squares inverse problem:

m = inv(G'*W'*W*G)*G'*W'*W*d

I would like to set up the weighting matrix W, which is a square diagonal matrix with weights on the diagonal.

As I have a large number of data points in d (10⁷), my system matrix G is also large, but only in one dimension (as I have far more data points than model parameters). In the case of 6 model parameters, G is of size (10⁷ × 6). Hence, W has to be of size (10⁷ × 10⁷). However, it is sparse, with only 10⁷ non-zero entries (the weights).

To reduce memory, I use sparse on W.

To assign the weights, I do the following:

d = [d1;d2];   
W = sparse(length(d),length(d));
w1 = length(d2)/length(d);
w2 = length(d1)/length(d);
W(1:length(d)+1:length(d)*length(d1)) = w1;
W(length(d)*length(d1)+1:length(d)+1:end) = w2;

d1 and d2 are column vectors with observations.

This will assign weights to the diagonal, but it is awfully slow.

My question:

Can I either

  • Somehow speed up the assignments of weights to the diagonal, or
  • Re-write m = inv(G'*W'*W*G)*G'*W'*W*d so that I do not have to set up W at all?

Note 1: In the example shown, the weights are two different constants, but in practice they will vary along the diagonal!

Note 2: The bottleneck of the code is indeed setting up W, not the inversion itself, as the matrix to be inverted is only of size (6 × 6).

TheodorBecker
  • What are `d1` and `d2`? Please post runnable code. Also, `d = [d1;d2]` implies `w1` and `w2` are just ones and `W` is `eye`? Why _two_ diagonal assignments (last two lines)? – Luis Mendo Jul 06 '15 at 10:45
  • @LuisMendo I edited the code. However, I don't see why W should be eye. w1 and w2 are weights, in this case depending on the length of the respective observation vector, but they could be any float. – TheodorBecker Jul 06 '15 at 10:53
  • `d = [d1;d2]` implies `length(d1)==length(d2)`. So `w1` and `w2` are `1` – Luis Mendo Jul 06 '15 at 10:54
  • `d = [d1;d2]` is a vertical concatenation of n × 1 and m × 1 vectors and as such does not imply anything about the lengths of `d1` and `d2`, except that `length(d1)+length(d2) == length(d)` – TheodorBecker Jul 06 '15 at 10:57
  • Ah, ok. I thought they were row vectors. You should indicate that in the question – Luis Mendo Jul 06 '15 at 10:59
  • Have you tried using `W = sparse([1:numel(d1) 1:numel(d2)], 1:numel(d), [w1; w2], numel(d), numel(d));`, where `w1` and `w2` are column vectors? This replaces the initialization `W = sparse(length(d),length(d));` and the two assignment lines – Luis Mendo Jul 06 '15 at 11:10
  • @LuisMendo: Great, I did not realize such initialization of sparse matrices was possible. It's actually even given in the documentation. It is faster by several orders of magnitude. Thanks a lot! – TheodorBecker Jul 06 '15 at 12:03
  • If the weights matrix is diagonal, then W' = W and W'W is also a diagonal matrix with squared weights on the diagonal. I think this could be represented as a vector and handled specially on the multiplication. – duffymo Jul 06 '15 at 12:47
  • With the code you provided, your matrix `W` is not diagonal, but rather a horizontal concatenation of two diagonal matrices. This will make the matrix inversion fail, because it makes `G.'·W.'·W·G` singular. Can you check whether the code is correct by looking at `full(W)` for a small toy problem? – Rody Oldenhuis Jul 07 '15 at 15:34
  • If `W` is truly diagonal, then you can better do `bsxfun(@times, G, [length(d2) length(d1)]/length(d))` to compute `G.' · W.'`. What are `WG` and `Wd`? – Rody Oldenhuis Jul 07 '15 at 15:43
  • @RodyOldenhuis: I edited the post. It was not very readable. Now the multiplication signs should make it clearer. W is and was truly diagonal, also with my first approach. I checked it with full on a subset. – TheodorBecker Jul 07 '15 at 16:47
  • But if I do `d1 = rand(3,1); d2 = rand(5,1);` and then execute the code above for constructing `W`, I do not get a diagonal matrix... – Rody Oldenhuis Jul 08 '15 at 04:36
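As Rody Oldenhuis's toy example in the comments indicates, the second indexed assignment in the question does not land on the main diagonal (its first linear index corresponds to element (1, length(d1)+1)). For a truly diagonal W, the whole matrix can be built in a single call; a minimal sketch, assuming `w1` and `w2` are column vectors holding the weights for `d1` and `d2`:

```matlab
n = numel(d);
w = [w1; w2];               % n-by-1 vector of diagonal entries
W = spdiags(w, 0, n, n);    % place w on the main diagonal of an n-by-n sparse matrix
% Equivalently: W = sparse(1:n, 1:n, w, n, n);
```

This avoids the slow element-by-element assignment into a preallocated sparse matrix entirely.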

1 Answer


By the structure of W, I'd say it is more efficient to split G into two halves and apply the weights as scalar factors in the products:

G1 = G(1:length(d1),:);
G2 = G(length(d1)+1:end,:);

m = (w1^2*G1'*G1 + w2^2*G2'*G2) \ (w1^2*G1'*d1 + w2^2*G2'*d2);

Also, generally use A \ b instead of inv(A)*b (even if A is small).
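If the weights vary along the diagonal (Note 1 of the question), the same idea still avoids forming W: W'·W is diagonal with the squared weights, so multiplying by it reduces to an element-wise scaling of the rows of G and d. A sketch, assuming `w` is the 10⁷-by-1 vector of per-observation weights:

```matlab
ws = w.^2;             % squared weights: the diagonal of W'*W
A  = G' * (ws .* G);   % 6-by-6 normal matrix (implicit expansion, R2016b+;
                       % use bsxfun(@times, G, ws) on older releases)
b  = G' * (ws .* d);   % 6-by-1 right-hand side
m  = A \ b;            % the 10^7-by-10^7 matrix W is never formed
```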

chtz