How do I check for overflow of integer arithmetic in CUDA?

Question

In CUDA, how can I determine whether my last integer arithmetic operation has overflowed/underflowed or not? Can I get the value of an overflow flag?

score 1 · Answer 1 · answered May 30 '18 at 16:35

A partial answer, or what I've thought of so far:

Special Cases

(These utilize some PTX instructions which, as far as I can tell, are not available directly in CUDA; you would need wrapper functions, implemented using inline PTX, to use them.)

Signed 32-bit values

If you compute the same sum with both add.s32 and add.sat.s32 (or the same difference with both sub.s32 and sub.sat.s32), comparing the two results tells you whether the operation overflowed: they differ exactly when it did. There's also fused multiply-add, which has a saturating form for 32-bit signed values. Note that in PTX the .sat modifier on mad is only available in .hi mode (mad.hi.sat.s32), so the comparison there is against mad.hi.s32, and it tells you whether the upper 32 bits of the result overflowed (which you might not quite consider overflow, really). To better understand what lo and hi mean in this context, read on.
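As a rough illustration of the comparison trick for addition, here is a minimal sketch using inline PTX. All function names (add_wrap_s32, add_sat_s32, add_overflows_s32, add_overflows_on_device) are made up for this example, and the host helper skips CUDA error checking for brevity.

```cuda
#include <cuda_runtime.h>

// Hypothetical wrapper: wrapping signed 32-bit add (PTX add.s32).
__device__ inline int add_wrap_s32(int a, int b) {
    int r;
    asm("add.s32 %0, %1, %2;" : "=r"(r) : "r"(a), "r"(b));
    return r;
}

// Hypothetical wrapper: saturating signed 32-bit add (PTX add.sat.s32),
// which clamps the result to [INT_MIN, INT_MAX].
__device__ inline int add_sat_s32(int a, int b) {
    int r;
    asm("add.sat.s32 %0, %1, %2;" : "=r"(r) : "r"(a), "r"(b));
    return r;
}

// The two results differ exactly when the true sum does not fit in 32 bits.
__device__ inline bool add_overflows_s32(int a, int b) {
    return add_wrap_s32(a, b) != add_sat_s32(a, b);
}

// Single-thread kernel, just to exercise the check from the host.
__global__ void add_overflow_kernel(int a, int b, bool* out) {
    *out = add_overflows_s32(a, b);
}

// Host-side helper for illustration only (no error checking).
bool add_overflows_on_device(int a, int b) {
    bool* d_out = nullptr;
    bool h_out = false;
    cudaMalloc(&d_out, sizeof(bool));
    add_overflow_kernel<<<1, 1>>>(a, b, d_out);
    cudaMemcpy(&h_out, d_out, sizeof(bool), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return h_out;
}
```

The same pattern should carry over to subtraction by swapping in sub.s32 and sub.sat.s32.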

Multiplication

For multiplication, overflow is "avoided" in PTX by treating the result as twice as wide as the operands. The PTX multiplication instructions mul and mad (multiply-and-add) let you get either just the high or just the low half of the result (.hi / .lo) or, if the operands are 16 or 32 bits wide, the entire double-width output (.wide). So you can just use mul.hi.yourtype (or mad.hi.yourtype with a zero addend) and make sure the result is all-zeros (or, for a signed multiplication with a negative result, all-ones). More precisely: for unsigned types the high half must be zero, and for signed types it must be the sign extension of the low half.
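For 32-bit operands the high half is also reachable without inline PTX, through the CUDA device intrinsics __mulhi and __umulhi. The sketch below uses them for the check described above; the function names mul_overflows_u32, mul_overflows_s32 and mul_overflows_on_device are invented for this example, and the host helper omits error checking.

```cuda
#include <cuda_runtime.h>

// Unsigned: the product fits in 32 bits iff the high half is zero.
__device__ inline bool mul_overflows_u32(unsigned a, unsigned b) {
    return __umulhi(a, b) != 0u;
}

// Signed: the product fits iff the high half is the sign extension
// of the low half (all-zeros or all-ones, matching the low half's top bit).
__device__ inline bool mul_overflows_s32(int a, int b) {
    int hi = __mulhi(a, b);
    unsigned lo = (unsigned)a * (unsigned)b;   // low 32 bits, wraps safely
    int expected_hi = (lo & 0x80000000u) ? -1 : 0;
    return hi != expected_hi;
}

// Single-thread kernel: out[0] = signed check, out[1] = unsigned check.
__global__ void mul_overflow_kernel(int a, int b, bool* out) {
    out[0] = mul_overflows_s32(a, b);
    out[1] = mul_overflows_u32((unsigned)a, (unsigned)b);
}

// Host-side helper for illustration only (no error checking).
void mul_overflows_on_device(int a, int b, bool* signed_ovf, bool* unsigned_ovf) {
    bool* d_out = nullptr;
    bool h_out[2] = {false, false};
    cudaMalloc(&d_out, 2 * sizeof(bool));
    mul_overflow_kernel<<<1, 1>>>(a, b, d_out);
    cudaMemcpy(h_out, d_out, 2 * sizeof(bool), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    *signed_ovf = h_out[0];
    *unsigned_ovf = h_out[1];
}
```

Note that the signed and unsigned answers can differ for the same bit patterns: 65536 * 32768 is 2^31, which fits in an unsigned 32-bit value but not in a signed one.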

A slow approach for the general case

A slow but general solution is to compare a rough estimate of the result to the actual result. Take addition, for example. You would take the higher half of the bits of both operands and add those up. That sum would indicate either "certainly overflow", if it itself overflows into the one-past-half bit; "certainly no overflow", if it is so far from overflowing (or underflowing) that no values of the lower bits could make the full result overflow; or "maybe overflow", in which case you just need to make sure that the higher half of the actual result is close enough to the estimated higher half.

This is doable on any processor, but should really be avoided if you can do better.
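One way to read the three cases above, sketched for unsigned 32-bit addition only. This is my own interpretation of the estimate idea rather than a canonical algorithm, and the name add_overflows_u32_estimate is made up. It uses no PTX, so it is marked to compile for both host and device.

```cuda
#include <cstdint>

// Overflow check for unsigned 32-bit addition via an upper-half estimate.
__host__ __device__ inline bool add_overflows_u32_estimate(uint32_t a, uint32_t b) {
    // Sum of the upper 16-bit halves; at most 0x1FFFE, so it cannot wrap.
    const uint32_t est = (a >> 16) + (b >> 16);

    // "Certainly overflow": the estimate alone spills past 16 bits.
    if (est > 0xFFFFu) return true;

    // "Certainly no overflow": even the largest possible carry from the
    // lower halves (at most 1) cannot push the upper half past 0xFFFF.
    if (est < 0xFFFFu) return false;

    // "Maybe overflow": est == 0xFFFF. Without a carry the upper half of
    // the wrapped sum is still 0xFFFF; with a carry it wraps around to 0.
    const uint32_t result_hi = (a + b) >> 16;
    return result_hi != est;
}
```

For unsigned addition specifically, the single comparison (a + b) < a gives the same answer far more cheaply, which is in line with the advice to avoid this approach when something better is available.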
