
I'm using large matrices (100x100 to 3000x3000) to do some calculations (a lot of sums and matrix-vector multiplications). I'm using the Eigen library for my vectors and matrices. My code is simple C-like code (only functions, no classes) and is going to be compiled as a DLL to be used from Excel.

I identified a bottleneck in the following code:

// Q(z) matrix function
Eigen::MatrixXd qzMatrix(const Eigen::MatrixXd& xjk, const float& riskFreeRate, 
                         const float& volatility, const float& rebalancingPeriod)
{
    const int r = xjk.rows(), c = xjk.cols();
    Eigen::MatrixXd matrix(r, c);
    const double mu = (riskFreeRate - volatility * volatility / 2) * rebalancingPeriod;
    const double s = volatility * rebalancingPeriod;

    for (int j = 0; j < r; ++j)
        for (int k = 0; k < c; ++k)
            matrix(j, k) = (xjk(j, k) > 0) ? 0.5 * (1 + erf(((log(xjk(j, k)) - mu) / s) * M_SQRT1_2)) : 0;

    return matrix;
}

There is also a second, similar function in which the `erf` function takes a different argument (`erf` is used here to compute the standard normal CDF). The two functions take large matrices (`xjk` in this case, usually around 1000x1000) and are called in a loop at least 120 times (1000x1000x2x120 = 240 million calls of the `erf` function). I already tried the GSL normal CDF function, but it is slightly slower than the native C++ `erf` function.

Running the algorithm 120 times on a 1000x1000 matrix takes about 45 seconds. I'm using mingw-w64 4.9.2 and Code::Blocks on Windows 7 x64, 4 GB RAM, i5.

Is there a way I could speed up this algorithm?
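For reference, the hot loop can be isolated from Eigen in a minimal sketch (the `qz` name and flat-buffer interface here are hypothetical, not from the original code). Hoisting `mu` and the reciprocal of `s` out of the loop leaves only a `log`, an `erf`, and a multiply per element, and since every element is independent, an OpenMP pragma on the outer loop parallelizes it cleanly:

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical flat-buffer version of the per-element transform above.
// Constants are hoisted so the inner loop does only log + erf + multiply.
// Built with -fopenmp, a "#pragma omp parallel for" on the loop is safe
// because each output element depends only on the matching input element.
void qz(const double* x, double* out, std::size_t n, double mu, double s)
{
    const double invS = 1.0 / s;
    const double sqrtHalf = 0.70710678118654752440; // 1/sqrt(2), i.e. M_SQRT1_2
    // #pragma omp parallel for  (uncomment when compiling with -fopenmp)
    for (std::size_t i = 0; i < n; ++i)
        out[i] = (x[i] > 0)
               ? 0.5 * (1.0 + std::erf((std::log(x[i]) - mu) * invS * sqrtHalf))
               : 0.0;
}
```

This computes the same values as the Eigen loop; whether it is measurably faster depends on the compiler and optimization flags.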

Naucle
    How accurate does it need to be? Have you considered interpolation from a precalculated lookup table? – Ben Voigt Mar 11 '15 at 15:37
  • It has to be of the highest possible accuracy, the final 120 generated matrices are all stochastic matrices (sum of the elements of each line (1000 element) is equal to 1), I actually considered interpolation, but didn't give it much thoughts because of the accuracy of the values. – Naucle Mar 11 '15 at 15:41
    Actually I think @BenVoigt's idea is a good one, especially if you fit a cubic spline. The error function is approximately cubic, and, in particular, the fourth power coefficient is zero. Given that you're using `float`s, I don't imagine you'll need too many interpolator knots, and you can easily calibrate that to the *real* function. A day well-spent in my opinion. – Bathsheba Mar 11 '15 at 15:41
  • @Bathsheba And how can I achieve that ? I'm just a beginner at C++ by the way. – Naucle Mar 12 '15 at 10:34
    If you need "the highest possible accuracy", you would not want to compute the CDF of the standard normal distribution via `erf`, as this will lose accuracy in the tails. Using the complementary error function `erfc` instead results in computation that is accurate in the tails as well: `normcdf(x) = 0.5 * erfc (-sqrt (0.5) * x)`, where the term `-sqrt(0.5)` can be computed as a floating-point literal ahead of time, of course. – njuffa Jan 17 '16 at 23:48
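The `erfc` formulation from the last comment can be sketched as follows (a minimal helper, with the constant `-sqrt(0.5)` precomputed as a literal; the `normcdf` name is hypothetical):

```cpp
#include <cmath>

// Standard normal CDF via erfc, per the comment above. This stays accurate
// in the left tail, where the 0.5 * (1 + erf(...)) form loses precision to
// cancellation as erf(...) approaches -1.
inline double normcdf(double x)
{
    const double negSqrtHalf = -0.70710678118654752440; // -sqrt(0.5)
    return 0.5 * std::erfc(negSqrtHalf * x);
}
```

In the question's loop this would replace `0.5*(1 + erf(z * M_SQRT1_2))` with `normcdf(z)`, where `z = (log(xjk(j, k)) - mu) / s`.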

0 Answers