Using FPU return values in c++ code

Question

I have an x86 NASM function which seems to work perfectly on its own, but I have problems using the values returned from it. This is 32-bit Windows using MSVC++. I expect the return value in ST0.

A minimal example demonstrating the problem with the returned values can be seen in this C++ and NASM assembly code:

#include <cstdio>     // printf
#include <iostream>   // std::cout
extern "C" float arsinh(float);

int main()
{
    float test = arsinh(5.0);
    printf("%f\n", test);
    printf("%f\n", arsinh(5.0));
    std::cout << test << std::endl;
    std::cout << arsinh(5.0) << std::endl;
}

Assembly code:

section .data
value: dq 1.0
section .text
global _arsinh
_arsinh:
    fld dword[esi-8]      ;loads the given value into st0
    ret

I can't figure out how to use the return value though, as I always get the wrong value no matter which data type I use. In this example the value 5 should be returned and I'd expect output like:

5.000000

5.000000

5

5

Instead I get output similar to:

-9671494178951383518019584.000000

-9671494178951383518019584.000000

-9.67149e+24

5

Only the final value appears to be correct. What is wrong with this code? Why doesn't it always return the floating point value I am expecting from my function? How can I fix this code?

What you mean by *does not work* and *does work* is beyond my comprehension. *it works completely perfect, I just have problems using the values returned from it* is akin to saying *I can fly, only I fall when I try*. — SergeyA, Jun 16 '16 at 17:42
Then perhaps you should post the function `float arsinh(float)` — Weather Vane, Jun 16 '16 at 17:43
It compiles, but shows the wrong value. Only the last one shows the actual right value. Why is it necessary to post my assembler function? It returns a value in st0 (that's the only important thing for this question), which is only correctly displayed by the last output; all the other variations print wrong values. — Hajaku, Jun 16 '16 at 17:46
You might want to return the value on the stack, or so [this](http://stackoverflow.com/questions/30322006/call-function-with-float-in-assembly-x86-x87#30324509) answer suggest. — Jonas Byström, Jun 16 '16 at 17:47
I guess, first step for you is to stop equating *compiles* and *works*. Second step is to provide the expected result vs actual result and explain why expected result is expected. — SergeyA, Jun 16 '16 at 17:48
Why post the code for `arsinh`? Because the code you posted reveals nothing. — Weather Vane, Jun 16 '16 at 17:49
The function calculates the area sinus hyperbolicus of a given float. The displayed values are: 1- 194.130431 2- 194.130432 3- 194.13 4- 2.31241 . The fourth value is the actual result of computing the area sinus hyperbolicus using another math library. All the other values are obviously not what I am expecting. — Hajaku, Jun 16 '16 at 17:50
This question is rapidly getting closed. Please provide the "perfect" code. — Weather Vane, Jun 16 '16 at 17:51
That's a funny problem. Looks like something to do with the way you are returning the value in assembly. — Eugene Sh., Jun 16 '16 at 17:57
How can the correct fourth result be from "another math library?" You call the same function. If not, you have posted complete fiction and expect a seer to find your bug. — Weather Vane, Jun 16 '16 at 17:59
It's not from another math library - it's the correct value calculated by my function. I just know that it's the correct value because I used another library to verify that that is indeed the value of the area sinus hyperbolicus of 5.0. (Which is what I meant by that) — Hajaku, Jun 16 '16 at 18:01
@WeatherVane I believe it means that the result printed is equal to a result computed by a different lib. — Eugene Sh., Jun 16 '16 at 18:01
Just a guess, only in the last of the 3 calls to `arsinh` is the result sent directly to C++ style output. — Weather Vane, Jun 16 '16 at 18:07
@WeatherVane Yeah, that's why I said it's a funny problem. why should it matter other than because of some quirk in calling conventions? — Eugene Sh., Jun 16 '16 at 18:09
The problem is that the code is 100+ lines long. But all of that should not really matter. The only register I change is EAX and at the end of my function I clear the FPU stack completely and fld dword [result]. I can call the function 1000 times in a loop and it will always print the right results, but only using the 4th method, all the other ones consistently give me a wrong value. — Hajaku, Jun 16 '16 at 18:09
Well, it could be some kind of undefined behavior somewhere.. — Eugene Sh., Jun 16 '16 at 18:12

Michael Petch · Answer 1 · 2016-06-19T16:06:06.903

The primary issue is not that there is a failure returning a value in floating point register ST0, but in the way you attempt to load the 32-bit (single precision) float parameter from the stack. The issue is here:

fld dword[esi-8]      ;loads the given value into st0

This should read:

fld dword[esp+4]      ;loads the DWORD parameter from stack into st0

fld dword[esi-8] only works sometimes because of the way the calling function uses ESI internally. With different C compilers and optimizations enabled you may find the code fails to work altogether.

With 32-bit C/C++ code, parameters are passed on the stack from right to left. When a CALL instruction executes in 32-bit code, the 4-byte return address is pushed onto the stack. On entry to the function, memory address esp+0 therefore contains the return address and the first parameter is at esp+4. If you had a second parameter it would be at esp+8. A good description of the Microsoft 32-bit CDECL calling convention can be found in this WikiBook entry. Of importance:

Function arguments are passed on the stack, in right-to-left order.

Function result is stored in EAX/AX/AL

Floating point return values will be returned in ST0

8-bit and 16-bit integer arguments are promoted to 32-bit arguments.
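
The layout described above can be modelled in plain C++. This is only a sketch: FakeStack, make_frame and load_dword_as_float are hypothetical names, and the byte array merely stands in for the real stack as the callee sees it on entry to a two-float cdecl function. It assumes a 4-byte float, as on 32-bit MSVC.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "this model assumes a 4-byte float");

// Hypothetical model: 12 bytes standing in for the stack at function entry.
struct FakeStack {
    unsigned char bytes[12];
};

// Lay out the frame the way a cdecl call f(first, second) would:
// the arguments are pushed right to left, then CALL pushes the return address.
FakeStack make_frame(std::uint32_t return_address, float first, float second) {
    FakeStack s{};
    std::memcpy(s.bytes + 0, &return_address, 4);  // [esp+0] return address
    std::memcpy(s.bytes + 4, &first, 4);           // [esp+4] first parameter
    std::memcpy(s.bytes + 8, &second, 4);          // [esp+8] second parameter
    return s;
}

// Rough equivalent of "fld dword [esp+offset]": read 4 bytes as a float.
float load_dword_as_float(const FakeStack& s, std::size_t offset) {
    float value;
    std::memcpy(&value, s.bytes + offset, sizeof value);
    return value;
}
```

In this model, load_dword_as_float(make_frame(0x00401000u, 5.0f, 7.0f), 4) yields 5.0f, which is why the load in the assembly has to use esp+4 rather than an offset from ESI.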

When dealing with x87 FPU instructions, it is very important that the only value on the FPU stack when returning a FLOAT is the value in ST0. Failing to release (pop or free) anything else you put on the FPU stack can cause your function to fail when it is called multiple times. The x87 FPU stack has only 8 slots (not very many). If you don't clean up the FPU stack before the function returns, later instructions that need to load a new value onto it can cause an FPU stack overflow.
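
To see why a leftover value eventually bites, here is a toy C++ model of the 8-slot register stack. It is a sketch under stated assumptions, not a description of real hardware behaviour: FpuStackModel, leaky_call and calls_until_overflow are hypothetical names, the callee is assumed to leave one extra value behind on every call, and the caller is assumed to pop only the return value in ST0.

```cpp
#include <cstddef>

// Hypothetical model of the x87 register stack: 8 slots, a push fails when full.
class FpuStackModel {
public:
    static constexpr std::size_t kSlots = 8;

    bool push() {
        if (depth_ == kSlots) return false;  // stack overflow
        ++depth_;
        return true;
    }
    void pop() {
        if (depth_ > 0) --depth_;
    }
    std::size_t depth() const { return depth_; }

private:
    std::size_t depth_ = 0;
};

// Models a function that loads two values and returns without popping either,
// so one value too many remains on the stack. Returns false on overflow.
bool leaky_call(FpuStackModel& fpu) {
    if (!fpu.push()) return false;  // a temporary that is never released
    if (!fpu.push()) return false;  // the return value in ST0
    return true;
}

// Counts how many calls succeed before a load overflows the stack.
int calls_until_overflow() {
    FpuStackModel fpu;
    int calls = 0;
    while (leaky_call(fpu)) {
        fpu.pop();  // the caller pops the return value only
        ++calls;
    }
    return calls;
}
```

In this model calls_until_overflow() returns 7: each call leaves one more slot occupied, and the second load of the eighth call no longer fits.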

An example implementation of your function could look like this:

use32
section .text
; _arsinh takes a single float (angle) as a parameter
;     angle is at memory location esp+4 on the stack
;     arcsinh(x) = ln(x + sqrt(x^2+1)) 
global _arsinh
_arsinh:
    fldln2           ; st(0) = ln2
    fld dword[esp+4] ; st(0) = angle, st(1)=ln2
    fld st0          ; st(0) = angle, st(1) = angle, st(2)=ln2
    fmul st0         ; st(0) = angle^2, st(1) = angle, st(2)=ln2
    fld1             ; st(0) = 1, st(1) = angle^2, st(2) = angle, st(3)=ln2
    faddp            ; st(0) = 1 + angle^2, st(1) = angle, st(2)=ln2
    fsqrt            ; st(0) = sqrt(1 + angle^2), st(1) = angle, st(2)=ln2
    faddp            ; st(0) = sqrt(1 + angle^2) + angle, st(1)=ln2
    fyl2x            ; st(0) = log2(sqrt(1 + angle^2) + angle)*ln2
                     ; st(0) = asinh(angle)
    ret
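
As a cross-check of the formula in the comments, the same arithmetic can be written in C++ and compared against std::asinh. This is a sketch, not part of the assembly: arsinh_reference is a hypothetical helper name, and it works in double precision rather than on the x87 stack.

```cpp
#include <cmath>

// Hypothetical helper mirroring the x87 sequence above:
// asinh(x) = ln(x + sqrt(x^2 + 1)) = ln(2) * log2(x + sqrt(x^2 + 1)).
double arsinh_reference(double x) {
    const double ln2 = std::log(2.0);               // fldln2
    const double sum = x + std::sqrt(x * x + 1.0);  // fmul, fld1, faddp, fsqrt, faddp
    return ln2 * std::log2(sum);                    // fyl2x: st(1) * log2(st(0))
}
```

For x = 5.0 this gives approximately 2.3124, which matches std::asinh(5.0) to within rounding error.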
