operators for BF16 floating-point values & who defines __CUDA_NO_BFLOAT16_OPERATORS__?

Question

I'm seeing inconsistent availability of the bfloat16 operators when compiling kernel code with NVRTC (with cuda_bf16.h included) on two different machines, both of which have the same CUDA version, 11.2.

On one machine, this operator from cuda_bf16.hpp:

__device__ __forceinline__ __nv_bfloat16 &
operator+=(__nv_bfloat16 &lh, const __nv_bfloat16 &rh) 
{ lh = __hadd(lh, rh); return lh; }

is available; on the other machine it is missing, and my compilation fails. If I define the operator myself, the build succeeds on the machine where it was missing but fails on the other one because of a redefinition.

This is already enough to ask the question: What could cause the operator to only be available on one of the machines?

But I have what is perhaps a clue: The definition of the operators is enclosed in:

#if !defined(__CUDA_NO_BFLOAT16_OPERATORS__)
#endif /* !defined(__CUDA_NO_BFLOAT16_OPERATORS__) */

and I suspect that something, somewhere, is defining that macro. What could it be? I can't find this string anywhere in /usr/local/cuda-11.2/include other than in these two lines.
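To check that suspicion on each machine, one option is a small preprocessor probe placed in the kernel source right after the include of cuda_bf16.h. The following is only a sketch: the names BF16_OPERATORS_DISABLED and bf16_operators_disabled are made up for illustration, and it only reports whether the macro is defined in that translation unit, not who defined it. If it does turn out to be defined, plausible places to look would be the option strings passed to the NVRTC compile call (NVRTC accepts -D / --define-macro) and any #define earlier in the source string or in headers supplied by the application.

```cpp
// Probe sketch: place after `#include <cuda_bf16.h>` in the source handed to
// NVRTC. It is plain preprocessor/C++ code, so it also compiles on its own.
// BF16_OPERATORS_DISABLED and bf16_operators_disabled are hypothetical names.

#if defined(__CUDA_NO_BFLOAT16_OPERATORS__)
#define BF16_OPERATORS_DISABLED 1
#else
#define BF16_OPERATORS_DISABLED 0
#endif

// Compile-time flag: true if the guard macro was defined for this translation unit.
constexpr bool bf16_operators_disabled = (BF16_OPERATORS_DISABLED != 0);

// To surface the result in the compile log, uncomment the block below; the
// build then fails with this message only where the macro is defined.
// #if BF16_OPERATORS_DISABLED
// #error "__CUDA_NO_BFLOAT16_OPERATORS__ is defined in this translation unit"
// #endif
```

With the #error block enabled, the machine on which the operators are missing should fail with that message if the macro is indeed the cause; if it compiles cleanly there, the operators are being hidden by something else.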

Note: The machines' driver versions differ. One has 460.73.01, and the other has 535.54.03.

The obvious answer is that the two systems are not the same. Short of a repro case and a lot of instrumentation, I don't see how to pin it down. But after 250 odd CUDA questions, you knew that already, I guess — talonmies, Aug 30 '23 at 01:47
@talonmies: The question is, given the aspects which are the same (CUDA distribution) - what are the aspects which might differ? And more specifically, how would this definition be set? The answer to that, especially the last part, doesn't come from a potential example. — einpoklum, Aug 30 '23 at 08:37
As I said, the way to find out when and whether the protecting symbol is set is by instrumentation (given neither of us have access to the compiler source). That is why a repro case is critical to empirically deriving an answer. Otherwise a bug report to NVIDIA is the only other way forward — talonmies, Aug 30 '23 at 08:41
@AbatorAbetor Yes they do, and quite significantly. Edited to list the driver versions. I had assumed that could not matter for the NVRTC compilation stage, but - maybe I'm wrong. — einpoklum, Aug 30 '23 at 08:58

