
I'm in the process of converting a program from Scilab code to C++ and it's essential for me to maintain the results produced by Scilab.

I'm aware that Scilab uses IEEE 754 double precision and that C++ doubles (although not required to) are implemented in a similar way.
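As a quick sanity check of that assumption, something like the following sketch could be used. It relies only on the standard `std::numeric_limits` traits; the function names `doubleIsBinary64` and `longDoubleIsWider` are made up for illustration. Note that it only describes the storage formats, and says nothing about whether a compiler keeps intermediate results in extra precision.

```cpp
#include <limits>

// True when this platform's double claims IEEE 754 (IEC 60559) conformance
// and has the 53-bit binary significand of the binary64 format.
constexpr bool doubleIsBinary64() {
    return std::numeric_limits<double>::is_iec559 &&
           std::numeric_limits<double>::radix == 2 &&
           std::numeric_limits<double>::digits == 53;
}

// True when long double carries more significand bits than double.
// On some toolchains (MSVC, for example) long double has the same
// format as double, in which case this returns false.
constexpr bool longDoubleIsWider() {
    return std::numeric_limits<long double>::digits >
           std::numeric_limits<double>::digits;
}
```

Both functions are `constexpr`, so they can be used in a `static_assert` to fail the build on a platform where the assumption does not hold.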

Is it then a bad idea to use higher precision (e.g. long double) in C++ if I'm trying to exactly match the results of Scilab?

For example: is it possible for Scilab to calculate a number as 0.1234 whereas C++ using long doubles would calculate it as 0.12345, potentially creating a discrepancy that makes the two programs produce different results (albeit more accurate ones in C++)?

Paul Warnick
  • I'm willing to bet that you'll never get *exactly* identical results. But do you *really* need to? – Jesper Juhl Jun 20 '16 at 18:34
  • @JesperJuhl Technically speaking, I don't need the results to be exact. The problem is, there is a lot of looping in my program and if the initial loop is off by even a tiny fraction (which it is), the end difference becomes significant. – Paul Warnick Jun 20 '16 at 18:37
  • If the end difference is significant doesn't that mean your program requires greater precision than built-in floating point will allow? – Galik Jun 20 '16 at 18:49
  • @Galik Not that I'm aware of. The reason the end difference becomes significant is because of the initial error combined with the number of calculations this error is subject to. Meaning, if on the first iteration of my program there is an error of 0.000000000001, that difference becomes amplified as the program runs. For example, by the 1,000th iteration that error has become 0.0001. Then by the 1,000,000th iteration the error is rather large. – Paul Warnick Jun 20 '16 at 18:54
  • @Galik This is because each step of the loop depends on values calculated in the previous step. Therefore the error grows larger and larger as the program runs. – Paul Warnick Jun 20 '16 at 18:56
  • I understand how the error accumulates to become "significant". What I mean is why is one floating point unit's accumulated difference any "better" than the other? It sounds like you may be simply preferring the inaccuracies of one floating point unit over those of another. – Galik Jun 20 '16 at 19:08
  • @Galik Ah, my bad. I mean: if the results from using just a regular double in C++ were identical (correct), would it be a problem to then use a long double? In other words, can increasing the precision actually cause an issue (because with less precision the answer is correct)? – Paul Warnick Jun 20 '16 at 19:12
  • How have you decided that the Scilab figures are "correct"? All fixed-size digital floating-point arithmetic will have errors. The larger the number of bits, the smaller those errors should be. If moving to higher precision gives you different results then, my feeling would be, that is because your original Scilab calculation has "significant" errors that are not as significant in the higher-precision calculation (all other things being equal). I would question whether the original Scilab results were actually *correct*, or whether you want to replicate the same level of error that Scilab produces. – Galik Jun 20 '16 at 19:21
  • @Galik That's actually what I'm trying to do: replicate the same level of error as the Scilab code. And I suppose that answers my question. Since the Scilab code only uses double precision, using a long double in C++ would increase the difference between the two programs. – Paul Warnick Jun 20 '16 at 19:24

1 Answer


Yes, this is entirely possible.

But since floating-point arithmetic is generally inexact, you should design your algorithm with that in mind and try to keep those "rounding errors" from ruining your calculations after some time.

What's more, as already noted in the comments: don't expect your C++ program to produce exactly the same results as the Scilab program does, especially if it is that sensitive to small changes. That's why most numerical simulations are only precise up to some limit before they start to produce "wrong" results.

To give you some useful advice: C++ has the very useful option of typedefs. Use a typedef like typedef long double myFloatType; and only use myFloatType for your calculations (think of a better name, one that actually says more about what it's used for here!). Then you can easily change it afterwards by changing just one line of code, and compare the results.
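A minimal sketch of that idea, under stated assumptions: the alias name `Real`, the helper names `iterate` and `precisionGap`, and the logistic-map update are all made up for illustration and stand in for whatever your loop actually computes. The map is used only because each step depends on the previous one, so small differences get amplified. A `using` alias is used here, which is equivalent to the typedef.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical alias: change this one line to switch the precision
// used by the rest of the program.
using Real = double;  // or: long double

// Illustrative recurrence x <- r * x * (1 - x) with r = 3.9, run for
// `steps` iterations in the floating-point type T.
template <typename T>
T iterate(T x0, int steps) {
    assert(steps >= 0);
    const T r = static_cast<T>(3.9L);
    T x = x0;
    for (int i = 0; i < steps; ++i) {
        x = r * x * (T(1) - x);
    }
    return x;
}

// Absolute difference between running the same recurrence in double
// and in long double, starting from the same value.
long double precisionGap(long double x0, int steps) {
    const long double narrow = iterate<double>(static_cast<double>(x0), steps);
    const long double wide = iterate<long double>(x0, steps);
    return std::fabs(narrow - wide);
}
```

Calling `precisionGap(0.5L, n)` for increasing `n` gives a rough feel for how quickly the two precisions drift apart. On a platform where long double has the same format as double, the gap should simply stay at zero.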

If the difference is significant, it might be worth thinking about a better algorithm.

Anedar