I wrote a simple cooperative multi-threading library. Currently I always save and restore the fpu state with fxsave / fxrstor when switching to a new context. But is this necessary in the cdecl calling convention?

As a simple example:

float thread_using_fpu(float x)
{
    float y = x / 2; // do some fpu operation
    yield();         // context switch, possibly altering fpu state.
    y = y / 2;       // another fpu operation
    return y;
}

May the compiler make any assumptions about the FPU state after the call to yield()?

user5434231
  • No, the usual convention mandates FPU state be empty upon entry and exit (unless used for return value obviously). – Jester Mar 12 '16 at 22:17
  • Thanks, do you happen to have a source for this? I couldn't find much about this myself. Not having to save and restore a 512-byte buffer every time would really help improve performance, and I want to be 100% sure this won't cause any issues. – user5434231 Mar 12 '16 at 23:15

2 Answers

No. You don't have to save the state yourself. If a thread is interrupted in the middle of a floating-point calculation where, for example, the denormal flag is set, then when it resumes the OS kernel will restore the flags, just as it restores the other registers. Likewise, you don't have to worry about it in a yield().

Edit: If you are doing your own context switching, it is possible you would need to save the precision and rounding control flags if you need to set them to non-default values. Otherwise, again you're fine.

Rob L

As per the System V Application Binary Interface, Intel386 Architecture Processor Supplement, page 3-12:

%st(0): If the function does not return a floating-point value, then this register must be empty. This register must be empty before entry to a function.

%st(1) through %st(7): Floating-point scratch registers have no specified role in the standard calling sequence. These registers must be empty before entry and upon exit from a function.

Thus, you do not need to save and restore them on a context switch.

Another, newer version says this:

The CPU shall be in x87 mode upon entry to a function. Therefore, every function that uses the MMX registers is required to issue an emms or femms instruction after using MMX registers, before returning or calling another function. [...] The control bits of the MXCSR register are callee-saved (preserved across calls), while the status bits are caller-saved (not preserved). The x87 status word register is caller-saved, whereas the x87 control word is callee-saved. [...] All x87 registers are caller-saved, so callees that make use of the MMX registers may use the faster femms instruction.

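So you may need to save the control words: the x87 control word and the control bits of MXCSR are callee-saved, so they must survive the call to yield().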
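As a rough illustration of what saving only the callee-saved control state could look like, here is a minimal sketch. It assumes GCC or Clang inline assembly on an x86 target with SSE available. The names fp_control_t, fp_control_save, fp_control_restore and fp_control_demo are hypothetical and not part of any library. It stores 6 bytes per thread instead of the 512-byte fxsave area.

```c
#include <stdint.h>

/* Hypothetical per-thread record of the callee-saved FP control state. */
typedef struct {
    uint16_t x87_cw; /* x87 control word: precision, rounding, exception masks */
    uint32_t mxcsr;  /* SSE control/status register */
} fp_control_t;

/* Store the current control state (assumes an x86 CPU with SSE). */
static inline void fp_control_save(fp_control_t *s)
{
    __asm__ volatile ("fnstcw %0"  : "=m" (s->x87_cw));
    __asm__ volatile ("stmxcsr %0" : "=m" (s->mxcsr));
}

/* Load a previously stored control state. The MXCSR status bits come along
 * too; the ABI text quoted above does not require them to be preserved. */
static inline void fp_control_restore(const fp_control_t *s)
{
    __asm__ volatile ("fldcw %0"   : : "m" (s->x87_cw));
    __asm__ volatile ("ldmxcsr %0" : : "m" (s->mxcsr));
}

/* Simulates switching from "thread A" to "thread B" and back.
 * Returns 1 if the control settings followed each switch, 0 otherwise. */
int fp_control_demo(void)
{
    fp_control_t a, b, now;

    fp_control_save(&a);                      /* thread A's settings */
    b = a;
    b.x87_cw = (uint16_t)(a.x87_cw ^ 0x0400); /* flip one x87 rounding-control bit */
    b.mxcsr  = a.mxcsr ^ 0x2000u;             /* flip one MXCSR rounding-control bit */

    fp_control_restore(&b);                   /* "switch to" thread B */
    fp_control_save(&now);
    int ok_b = now.x87_cw == b.x87_cw
            && ((now.mxcsr ^ b.mxcsr) & 0xFFC0u) == 0; /* compare control bits only */

    fp_control_restore(&a);                   /* "switch back" to thread A */
    fp_control_save(&now);
    int ok_a = now.x87_cw == a.x87_cw
            && ((now.mxcsr ^ a.mxcsr) & 0xFFC0u) == 0;

    return ok_a && ok_b;
}
```

In a real switch routine, fp_control_save would run on the outgoing thread's record and fp_control_restore on the incoming one, next to the general-purpose callee-saved registers. If no thread ever changes the rounding or precision settings, even this could be skipped.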

Jester