Some programs are intended solely for use with input that is known to be valid, or at least to come from trustworthy sources. Others are not. Certain kinds of optimizations which might be useful when processing only trusted data are stupid and dangerous when used with untrusted data. The authors of Annex L unfortunately wrote it excessively vaguely, but the clear intention is to allow compilers to indicate that they won't perform certain kinds of "optimizations" that are stupid and dangerous when processing data from untrustworthy sources.
Consider the function (assume "int" is 32 bits):
#include <stdint.h>

int32_t triplet_may_be_interesting(int32_t a, int32_t b, int32_t c)
{
    return a*b > c;
}
invoked from the context:
#define SCALE_FACTOR 123456
int my_array[20000];
int32_t foo(uint16_t x, uint16_t y)
{
    if (x < 20000)
        my_array[x]++;
    if (triplet_may_be_interesting(x, SCALE_FACTOR, y))
        return examine_triplet(x, SCALE_FACTOR, y);
    else
        return 0;
}
When C89 was written, the most common way a 32-bit compiler would process that code would have been to perform a 32-bit multiply and then a signed comparison with y. A few optimizations are possible, however, especially if a compiler inlines the function invocation:
On platforms where unsigned compares are faster than signed compares, a compiler could infer that since none of a, b, or c can be negative, the arithmetical value of a*b is non-negative, and it may thus use an unsigned compare instead of a signed comparison. This optimization would be allowable even if __STDC_ANALYZABLE__ is non-zero.
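For instance, after in-lining the call within foo(), where x, SCALE_FACTOR, and y are all non-negative, the test might effectively become something like the following (a sketch of the transformation, not actual compiler output):

/* Unsigned compare is valid here because the arithmetical value of
   the product cannot be negative (illustrative sketch only): */
if ((uint32_t)((int32_t)x * SCALE_FACTOR) > (uint32_t)y)
    return examine_triplet(x, SCALE_FACTOR, y);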
A compiler could likewise infer that if x is non-zero, the arithmetical value of x*123456 will be greater than every possible value of y, and if x is zero, then x*123456 won't be greater than any. It could thus replace the second if condition with simply if (x). This optimization is also allowable even if __STDC_ANALYZABLE__ is non-zero.
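In other words, the second if within foo() could collapse to something like the following (again a sketch, not actual compiler output):

/* The smallest non-zero product, 1*123456, already exceeds any
   possible uint16_t value of y, so the test reduces to x itself: */
if (x)
    return examine_triplet(x, SCALE_FACTOR, y);
else
    return 0;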
A compiler whose authors either intend it for use only with trusted data, or else wrongly believe that cleverness and stupidity are antonyms, could infer that since any value of x larger than 17395 will result in an integer overflow, x may be safely presumed to be 17395 or less. It could thus perform my_array[x]++; unconditionally. A compiler may not define __STDC_ANALYZABLE__ with a non-zero value if it would perform this optimization.

It is this latter kind of optimization which Annex L is designed to address. If an implementation can guarantee that the effect of overflow will be limited to yielding a possibly-meaningless value, it may be cheaper and easier for code to deal with the possibility of the value being meaningless than to prevent the overflow. If overflow could instead cause objects to behave as though their values were corrupted by future computations, however, there would be no way a program could deal with things like overflow after the fact, even in cases where the result of the computation would end up being irrelevant.
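To make the hazard of that last transformation concrete, here is roughly what foo() would become once the bounds check is discarded on the assumption that x cannot exceed 17395 (a sketch of the effective result, not actual compiler output):

int32_t foo(uint16_t x, uint16_t y)
{
    my_array[x]++;   /* bounds check removed: out-of-bounds write
                        for any x in the range 20000..65535 */
    if (x)
        return examine_triplet(x, SCALE_FACTOR, y);
    else
        return 0;
}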
In this example, if the effect of integer overflow would be limited to yielding a possibly-meaningless value, and if calling examine_triplet() unnecessarily would waste time but would otherwise be harmless, a compiler may be able to usefully optimize triplet_may_be_interesting in ways that would not be possible if it were written to avoid integer overflow at all costs. Aggressive "optimization" will thus result in less efficient code than would be possible with a compiler that instead used its freedom to offer some loose behavioral guarantees.
Annex L would be much more useful if it gave implementations a way to advertise specific behavioral guarantees (e.g., that overflow will yield a possibly-meaningless result but have no other side effects). No single set of guarantees would be optimal for all programs, but the amount of text Annex L spent on its impractical proposed trapping mechanism could have been better spent specifying macros to indicate what guarantees various implementations could offer.
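Purely for illustration, such macros might have been used along the following lines; the macro name below is invented here and is not part of Annex L or any real implementation:

/* Hypothetical feature-test macro, invented for illustration only: */
int32_t triplet_may_be_interesting(int32_t a, int32_t b, int32_t c)
{
#if defined(__STDC_QUIET_OVERFLOW__) && __STDC_QUIET_OVERFLOW__
    /* Implementation promises overflow yields only a possibly-
       meaningless value with no other side effects: keep the cheap form. */
    return a*b > c;
#else
    /* No such promise: prevent the overflow explicitly. */
    return (int64_t)a * b > c;
#endif
}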