Armadillo's solve(A, b) returning different answer from Matlab, Eigen

Question

I am using Armadillo 5.200 with Visual Studio 2013 to solve a system of linear equations xA=b. My code calculates this by solving x=(A'\b')', and I am using Armadillo's solve() function to solve the system of linear equations.

My problem is that the solution returned to me is incorrect-- it is different from both the Eigen and Matlab solutions to the same problem, which match up. The reason I'm not just using Eigen in the first place is that it performs too slowly (I have to make the calculation several million times) and I want to see if using Armadillo speeds it up at all. I'm also just curious at this point why this error is occurring.

Here is a piece of sample code which illustrates the problem:

#include "stdafx.h"
#include <iostream>
#include <armadillo>
#include <Eigen/Dense>


// Matrix operation to find world coordinates.  
arma::mat mrdivide(arma::mat A, arma::mat B){
    arma::mat A_t = A.t();
    arma::mat B_t = B.t();  
    return solve(A_t, B_t).t();

}
Eigen::MatrixXd mrdivide(Eigen::MatrixXd A, Eigen::MatrixXd B){
    Eigen::MatrixXd A_t = A.transpose();
    Eigen::MatrixXd B_t = B.transpose();    
    return ((A_t).colPivHouseholderQr().solve(B_t)).transpose();

}
int main(int argc, char* argv[])
{

    Eigen::Matrix<double, 4, 3>  M_eig;
    M_eig << 761.544, 0, 0,
             0, 761.544, 0,
             639.5, 399.5, 1.0,
             3.762513283904080e+06, 1.824431013104484e+06, 9.837714402800992e+03;
    arma::mat M_arma;
    M_arma << M_eig(0, 0) << M_eig(0, 1) << M_eig(0, 2) << arma::endr
        << M_eig(1, 0) << M_eig(1, 1) << M_eig(1, 2) << arma::endr
        << M_eig(2, 0) << M_eig(2, 1) << M_eig(2, 2) << arma::endr
        << M_eig(3, 0) << M_eig(3, 1) << M_eig(3, 2) << arma::endr;

    Eigen::Matrix<double, 1, 3> pixelCoords_eig;
    pixelCoords_eig << 457, 520, 1;
    arma::mat pixelCoords_arma;
    pixelCoords_arma << 457 << 520 << 1 << arma::endr;

    Eigen::Matrix<double, 1, 4> worldCoords_eig;
    arma::mat worldCoords_arma;

    worldCoords_arma = mrdivide(M_arma, pixelCoords_arma);
    worldCoords_eig = mrdivide(M_eig, pixelCoords_eig);
    std::cout << "world coords arma: " << worldCoords_arma << std::endl;
    std::cout << "world coords eig: " << worldCoords_eig << std::endl;   

}

My output is:

world coords arma: 5.3599e-002  4.242e-001  1.3120e-001  8.8313e-005
world coors eig: .0978826  .439301  0   0.00010165

The eigen solution here is the correct one (the one Matlab gives me, and makes logical sense with the calculation I'm performing). Why would Armadillo give me the incorrect solution?

ggael · Answer 1 · 2015-07-31T07:07:29.293

Regarding Eigen's speed, you can get a significant boost by removing all heap allocation this way (almost one order of magnitude faster):

Matrix<double, 1, 4> mrdivide(const Matrix<double, 4, 3> &A,
                              const Matrix<double, 1, 3> &B){
  return A.transpose().colPivHouseholderQr().solve(B.transpose());
}

You can also save 10% more by using a row-major matrix for M_eig so that A.transpose() is column-major:

Matrix<double, 4, 3,RowMajor>  M_eig;
Matrix<double, 1, 4> mrdivide(const Matrix<double, 4, 3, RowMajor> &A,
                              const Matrix<double, 1, 3> &B);

Finally, since your problem is numerically well conditioned, you can also use a Cholesky-based solver for an additional x2 speedup (in this case keep the default storage for M_eig):

Matrix<double, 1, 4> mrdivide(const Matrix<double, 4, 3> &A, const Matrix<double, 1, 3> &B){
  return (A*A.transpose()).ldlt().solve(A*B.transpose());
}

For completeness, here is a self-contained example implementing the QR and LDLT solutions:

#include <iostream>
#include <Eigen/Dense>
using namespace Eigen;

Matrix<double, 1, 4> mrdivide_qr(const Matrix<double, 4, 3> &A,
                              const Matrix<double, 1, 3> &B){
  return A.transpose().colPivHouseholderQr().solve(B.transpose());
}

Matrix<double, 1, 4> mrdivide_ldlt(const Matrix<double, 4, 3> &A, const Matrix<double, 1, 3> &B){
  return (A*A.transpose()).ldlt().solve(A*B.transpose());
}

int main(int argc, char* argv[])
{
    Matrix<double, 4, 3>  M_eig;
    M_eig << 761.544, 0, 0,
            0, 761.544, 0,
            639.5, 399.5, 1.0,
            3.762513283904080e+06, 1.824431013104484e+06, 9.837714402800992e+03;

    Matrix<double, 1, 3> pixelCoords_eig;
    pixelCoords_eig << 457, 520, 1;

    Matrix<double, 1, 4> worldCoords_eig;

    worldCoords_eig = mrdivide_qr(M_eig, pixelCoords_eig);
    std::cout << "world coords using QR:   " << worldCoords_eig << std::endl;

    worldCoords_eig = mrdivide_ldlt(M_eig, pixelCoords_eig);
    std::cout << "world coords using LDLT: " << worldCoords_eig << std::endl;
}

mrdivide_ldlt is twise as fast as mrdivide_qr which is itself much faster than the mrdivide function of the question.

Using your first and second suggestions (transpostion of M_eig will not be remaining upper triangular, so I don't think I can use ldlt() unfortunately) I get a "YOU_MIXED_MATRICES_OF_DIFFERENT_SIZES" compilation error. Going back to using `MatrixXd` rather than `Matrix` and `Matrix` gets rid of this error, but that brings me back to my original problem of speed. — mjr_lzr, Jul 30 '15 at 15:07
Figured it out-- return line should be `return A.transpose().colPivHouseholderQr().solve(B.transpose()).transpose();`. That did speed it up quite a bit, thanks! — mjr_lzr, Jul 30 '15 at 15:28
hm, the last transpose should not be needed as transposing vectors in assignment is implicit. Also, for ldlt, not that you to pass `A*A.transpose()` and `A*B.transpose()` to get the minimal norm solution, not the original matrices! — ggael, Jul 30 '15 at 20:12
The solution of `A.transpose().colPivHouseholderQr().solve(B.transpose())` is a 4x1 matrix, however, and since I'm solving (A'\b')' for a 1x4 matrix I think I do end up needing the final transposition. LDLT is also not working for me-- I don't think I can use it since I'll be replacing M_eig with matrices that are not upper or lower triangular-- but as for right now, I keep getting buffer overrun errors. — mjr_lzr, Jul 30 '15 at 21:30
hm, you must be doing something weird. I edited the answer with a working self-contained example. — ggael, Jul 31 '15 at 07:07
Or there might be something else funky going on with my setup and Visual Studio... the code you gave me doesn't build! I get the same problem of it doesn't like that I declare the return value of `mrdivide_qr` to be `Matrix` unless I transpose it first. I left it `MatrixXd` just to see what it would do (since `worldCoords_eig` is `Matrix`) and it spat out nonsense numbers for the world coordinates. I also keep getting the same buffer overrun error when I try to run `mrdivide_ldlt`... very strange... — mjr_lzr, Jul 31 '15 at 14:40
I got mrdivide_ldlt to work in your sample code, but it again required transposing the solution. — mjr_lzr, Jul 31 '15 at 15:44

score 1 · Accepted Answer · 2015-08-02T17:23:29.223

Solution Armadillo is correct,because you solve overdetermined system.
Thus the solutions obtained by different methods may differ. The following code demonstrates how to use the library Armadilio, get a result similar Matlab. For simplicity, I did not fully realize the algorithm and did not check on the correctness of the argument function, so swapping rows is artificial (and not really necessary).

# include <iostream>
# include <armadillo>
# include <algorithm>
# include <array>

arma::mat qr_solve2(const arma::mat &A,const arma::mat &B);

arma::mat mrdivide(arma::mat A, arma::mat B)
    {
    return solve(A.t(), B.t());
    }
int main()
    {
    std::array<double,12> arr = {
        761.544 , 0 ,639.5,
        3.762513283904080e+06,0 , 761.544,
        399.5,1.824431013104484e+06,0,
        0, 1.0, 9.837714402800992e+03
        };
    arma::mat A(4,3);
    std::copy(arr.begin(),arr.end(),A.begin());
    arma::mat B;
    B << 457 << 520 << 1 << arma::endr;
    arma::mat V = mrdivide(A,B);
    std::cout<<V<<std::endl;
    std::cout<<A.t()*V<<std::endl;
    arma::mat Z = qr_solve2(A.t(),B.t());
    std::cout<<Z<<std::endl;
    std::cout<<A.t()*Z<<std::endl;
    return 0;
    }
arma::mat qr_solve2(const arma::mat &A,const arma::mat &B)
    {
    arma::mat Q, R;
    arma::qr(Q,R,A);
    unsigned int s = R.n_rows-1;
    arma::mat R_ = R( arma::span(0,s), arma::span(0,s-1) ) ;

    R_ = arma::join_horiz(R_,R.col(s+1));
    arma::mat newB = Q.t()*B;
    arma::mat X(s+1,1);

    for (int i = s; i >= 0; i--)
        {
        X[i] = newB[i];
        for (int j = s; j != i; j--)
            X[i] = X[i] - R_(i,j) * X[j];
        X[i] = X[i]/R_(i,i);
        }

    arma::mat res = X( arma::span(0,s-1), arma::span(0,0) ) ;

    res =  arma::join_vert(res,arma::mat(1,1, arma::fill::zeros));
    res = arma::join_vert(res,X.row(s));
    return res;
    }

As always, an indication of size matrix at compile time reduce the cost of memory allocation, so use it to speed up computing.

Armadillo's solve(A, b) returning different answer from Matlab, Eigen

2 Answers2