
First, I'm using this approximation of the natural log. Or look here (4.1.27) for a better statement of the formula.
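For reference, the series in question is (4.1.27), which is the Maclaurin series of 2*artanh(z):

ln(x) = 2 * (z + z^3/3 + z^5/5 + z^7/7 + ...),   where z = (x - 1) / (x + 1)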

Here's my implementation:

#include <limits>
#include <stdexcept>
#include <string>

using ull = unsigned long long;

constexpr double eps = 1e-12;

// Taylor series for e^x: sum of x^n / n! (truncated)
constexpr double my_exp(const double& power)
{
    double numerator = 1;
    ull denominator = 1;
    size_t count = 1;
    double term = numerator / denominator;
    double sum = 0;
    while (count < 20)
    {
        sum += term;
        numerator *= power;
        #ifdef _DEBUG
            if (denominator > std::numeric_limits<ull>::max() / count)
                throw std::overflow_error("Denominator has overflown at count " + std::to_string(count));
        #endif // _DEBUG
        denominator *= count++;
        term = numerator / denominator;
    }
    return sum;
}

constexpr double E = my_exp(1);

// Series for ln on [1, E]; range reduction via ln(x) = ln(x*E) - 1 and ln(x) = ln(x/E) + 1
constexpr double my_log(const double& num)
{
    if (num < 1)
        return my_log(num * E) - 1;
    else if (num > E)
        return my_log(num / E) + 1;
    else
    {
        double s = 0;
        size_t tmp_odd = 1;
        double tmp = (num - 1) / (num + 1);
        double mul = tmp * tmp;
        while (tmp >= eps)
        {
            s += tmp;
            tmp_odd += 2;
            tmp *= mul / tmp_odd;
        }
        return 2 * s;
    }
}
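
A minimal harness for the comparisons below, assuming my_exp and my_log above are in scope:

#include <cmath>
#include <cstdio>

int main()
{
    std::printf("my_exp(1)  = %.12f   std::exp(1)  = %.12f\n", my_exp(1), std::exp(1.0));
    std::printf("my_log(10) = %.5f    std::log(10) = %.5f\n", my_log(10), std::log(10.0));
}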

You can probably see why I want to implement these functions: basically, I want to implement a pow function. But my approach gives very imprecise answers; for example, my_log(10) = 2.30256, while according to Google ln 10 ≈ 2.30259.

my_exp() is very precise, since its Taylor expansion converges quickly: my_exp(1) = 2.718281828459, while e^1 = 2.71828182846 according to Google. Unfortunately, the same isn't true for the natural log, and I don't even know how this series for the natural log is derived (I mean the one from the links above); I couldn't find any source for it.

Where are the precision errors coming from?

Learpcs
  • why not `std::pow`? – 463035818_is_not_an_ai Jan 17 '22 at 12:06
  • @463035818_is_not_a_number for research purposes; I just wanted to see with my own eyes that the precision of log is enough to compute pow precisely. But I'm stuck implementing the log function, and apparently it's much harder than I expected. – Learpcs Jan 17 '22 at 12:10
  • I've struggled with the same thing; the Taylor series just doesn't seem to converge well for large or small values (even ln(x) = -ln(1/x) for larger numbers doesn't help). – Pepijn Kramer Jan 17 '22 at 12:10
  • no, `constexpr` isn't the issue. The wikipedia article has a section on series expansions: https://en.wikipedia.org/wiki/Logarithm. The one with a limited radius of convergence is rather poor, but that's not the one you are using. I'd check intermediate results against https://www.wolframalpha.com/ or similar. – 463035818_is_not_an_ai Jan 17 '22 at 12:19
  • You do a lot of floating-point operations, and each of them yields an approximate value. If you stack many "slightly imprecise" calculations, you get a large imprecision. First I would try `long double` (`__float128`?). If that doesn't help, you probably need to implement a type (or find a library) with higher-precision arithmetic. – Karol T. Jan 17 '22 at 12:25
  • Should `denominator *= count++;` be `denominator *= ++count;` ? – Richard Critten Jan 17 '22 at 12:26
  • @RichardCritten e^x = 1 + x + x^2/2! + ..., so the first two terms are both divided by 1; my version with the post-increment is right. – Learpcs Jan 17 '22 at 12:28
  • I found some more info here: https://math.stackexchange.com/questions/3381629/what-is-the-fastest-algorithm-for-finding-the-natural-logarithm-of-a-big-number – Pepijn Kramer Jan 17 '22 at 12:31
  • @KarolT. I thought about that too. But the problem is that this series sums literally 8 terms and then already falls within the epsilon neighbourhood, so I doubt that's it. – Learpcs Jan 17 '22 at 12:32
  • @KarolT.: "If you stack many 'slightly imprecise' calculations, you get large imprecision." That is only true for naïve implementations. Smarter numerical algorithms arrange the calculations so that errors cancel, even if there are more of them. `long double` is rarely a good solution. – MSalters Jan 17 '22 at 12:42
  • @Learpcs -- Did you copy your implementation from a "regular" mathematics book, or from a book geared towards implementing formulas while controlling the round-off error of a computing machine? If it's the former and not the latter, your results are not surprising. Formulas taken straight from a mathematics book, without code to handle round-off error, will give such answers. – PaulMcKenzie Jan 17 '22 at 13:58
  • log2/exp2 are much easier to compute on computers ... – Spektre Jan 18 '22 at 09:00

2 Answers


if (num < 1) return my_log(num * E) - 1; has an imprecision in the multiplication. Multiplying by 2 is more accurate. Of course, my_log(num) = my_log(2*num) - ln(2), so you'll need to change the 1.0 constant accordingly.

Yes, now you'll have a rounding error in -ln(2) instead of a rounding error in *E. That's typically less bad.

Also, you can save repeated rounding errors by first checking if (num < 1.0/16) and then using my_log(num) = my_log(16*num) - ln(16). That costs only a single rounding error. (Note the 1.0: a literal 1/16 would be integer division and evaluate to zero.)
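The general form of this idea is to pull the whole power-of-two factor out in one step. Here is a minimal sketch, using std::frexp for the reduction (which is exact for binary floating point) and std::log as a stand-in for the series core; log_pow2_reduced is an illustrative name:

#include <cmath>
#include <cstdio>

// num = m * 2^k with m in [0.5, 1); the reduction itself introduces no
// rounding error, so only k * ln(2) and the core contribute.
double log_pow2_reduced(double num)
{
    const double LN2 = 0.69314718055994530942;  // ln(2)
    int k = 0;
    double m = std::frexp(num, &k);   // exact decomposition
    return std::log(m) + k * LN2;     // std::log stands in for the series
}

int main()
{
    std::printf("%.17f vs %.17f\n", log_pow2_reduced(10.0), std::log(10.0));
}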

As for the error in your core loop, I suspect the culprit is s += tmp;. This is a repeated addition. You can use Kahan summation there.

MSalters
  • https://onlinegdb.com/6M8MrqTl6v I used __float128 instead of double and added Kahan summation. The result is still the same: a big precision error. I didn't implement the idea with ln(2), but I was checking errors with ln(1.5), which is in range already, so no extra multiplications/divisions are needed. Probably something is wrong with the series or my implementation. I'll try another series, I guess. – Learpcs Jan 17 '22 at 14:52

The line tmp *= mul / tmp_odd; means that each term is also being divided by the denominators of all previous terms, i.e. 1, 1*3, 1*3*5, 1*3*5*7, ... rather than 1, 3, 5, 7, ... as the formula states.

The numerator and denominator should therefore be computed independently:

double sum = 0;
double value = (num - 1) / (num + 1);
double mul = value * value;
size_t denom = 1;
double power = value;
double term = value;
while (term > eps)
{
    sum += term;
    power *= mul;
    denom += 2;
    term = power / denom;
}
return 2 * sum;

...

// Output for num = 1.5, eps = 1e-12
My func:   0.405465108108004513
Cmath log: 0.405465108108164385
             ------------

Much better!

Reducing the epsilon to 1e-18, we hit the accuracy limits of naïve summation:

// Output for num = 1.5, eps = 1e-18
My func:   0.40546510810816444
Cmath log: 0.405465108108164385
             ---------------

Kahan-Neumaier to the rescue:

double sum = 0;
double error = 0;
double value = (num - 1) / (num + 1);
double mul = value * value;
size_t denom = 1;
double power = value;
double term = value;
while (term > eps)
{
    double temp = sum + term;
    if (std::abs(sum) >= std::abs(term))  // std::abs from <cmath>; plain abs may resolve to the int overload
        error += (sum - temp) + term;
    else
        error += (term - temp) + sum;
    sum = temp;
    power *= mul;
    denom += 2;
    term = power / denom;
}
return 2 * (sum + error);

...

// Output for num = 1.5, eps = 1e-18
My func:   0.405465108108164385
Cmath log: 0.405465108108164385
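
Putting the two answers together: here is a complete, self-contained sketch combining power-of-two range reduction with the corrected, compensated series. my_log_fixed is an illustrative name, and LN2 is ln(2) to double precision:

#include <cstddef>
#include <cstdio>
#include <cmath>

constexpr double LN2 = 0.69314718055994530942;  // ln(2)

// Assumes num > 0. Reduce into [1, 2] by exact *2 / /2 steps, then
// evaluate ln via the atanh series with Neumaier compensation.
constexpr double my_log_fixed(double num)
{
    if (num < 1)
        return my_log_fixed(num * 2) - LN2;
    if (num > 2)
        return my_log_fixed(num / 2) + LN2;

    // num in [1, 2] => value in [0, 1/3]; every quantity below is
    // non-negative, so magnitudes can be compared directly.
    double sum = 0, error = 0;
    const double value = (num - 1) / (num + 1);
    const double mul = value * value;
    double power = value, term = value;
    std::size_t denom = 1;
    while (term > 1e-18)
    {
        const double temp = sum + term;
        error += (sum >= term) ? (sum - temp) + term : (term - temp) + sum;
        sum = temp;
        power *= mul;
        denom += 2;
        term = power / denom;
    }
    return 2 * (sum + error);
}

int main()
{
    std::printf("my_log_fixed(10) = %.17f\n", my_log_fixed(10.0));
    std::printf("std::log(10)     = %.17f\n", std::log(10.0));
}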
meowgoesthedog
  • You have no idea how thankful I am to you for finding such a lame mistake. I also want to add: it's probably a bad idea to implement `exp()` the way I did. Firstly, you can't calculate `exp(120)` (std::pow() can). Secondly, I again had precision problems, but now with the `my_pow()` function itself. It's better to implement `exp()` without those extra long long variables (the denominator and numerator, I mean), or to keep them but make them double too. I have some suspicions, but I can't see clearly why long long is the key problem here. – Learpcs Jan 17 '22 at 19:33
  • @Learpcs `120!` is far too large for even `unsigned long long` to store. Simply accumulating the series term with `term *= power / count++` avoids explicitly computing the factorial. Funnily enough, this is the exact opposite of the problem you had with `my_log`. However, simple summation alone would take very long to converge for an input as large as `120`; you could adopt a similar strategy to `my_log` by reducing the argument with exponentiation rules - see [my answer to an old post](https://stackoverflow.com/a/49988655/8204776). A sketch of this accumulation follows below. – meowgoesthedog Jan 17 '22 at 22:35
  • I feel slightly silly for trying to spot a bad numerical approach while overlooking a bug this big. But the fact that you still get 5 correct digits with the bug shows how fast the series converges - the range reduction means that the `num-1` term is always pretty small, and thus its powers are even smaller. – MSalters Jan 18 '22 at 09:51
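
For completeness, a minimal sketch of the accumulation meowgoesthedog describes above (my_exp_v2 is an illustrative name; no factorial is ever stored, so nothing overflows):

#include <cstdio>

constexpr double my_exp_v2(double power)
{
    double term = 1;  // x^0 / 0!
    double sum = 0;
    for (unsigned count = 1; count <= 40; ++count)
    {
        sum += term;
        term *= power / count;  // x^n/n! = x^(n-1)/(n-1)! * (x / count)
    }
    return sum;
}

int main()
{
    std::printf("my_exp_v2(1) = %.12f\n", my_exp_v2(1.0));  // ~2.718281828459
}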