
Suppose I want to perform a (single-precision) division of x by y in my CUDA kernel and, regardless of anything else, get a rounded-up result (i.e. rounded toward positive infinity). This is easy. Instead of:

float r = x / y;

I write:

float r = __fdiv_ru(x, y);

and I could do the same for rn (round to nearest), rd, etc.

How do I do the same thing in OpenCL?

If I look at the documentation for math functions (OpenCL 3.0), I find only native_divide, and am told that:

The built-in math functions are not affected by the prevailing rounding mode in the calling environment, and always return the same value as they would if called with the round to nearest even rounding mode.

So that's not the way to go. What do I do?

talonmies
einpoklum

1 Answer


According to this page, there's an extension which allows us to do this.

With the extension enabled, this should work:

#pragma OPENCL SELECT_ROUNDING_MODE rtp
float r = x / y; 

(Here, "rtp" stands for rounding toward positive infinity, i.e. rounding up.)

Unfortunately, the extension is deprecated and I'm not sure we can "trust" this approach. It also apparently involves some sort of macro trickery.

einpoklum
  • Check your results and ensure this produces the desired result. I've tried this in the past and been disappointed. My compiler ignored my pragma and did its own thing. I ended up condemned to inline assembly with PTX (look up "PTX Inline Assembly"; the CUDA syntax works in OpenCL). – Tim Aug 11 '22 at 20:43