Constant Divide Not Optimized?

Question

I have the following line of code:

#define A 360
#define B 360

temp = (s16_myvar * A) / B;

My compiler (Windriver DIAB PPC in this case, using standard extended optimization settings -XO) does not seem to optimize this away to something like temp = s16_myvar. When I look at the assembly listing, it seems to be faithfully putting 360 in a register and then after doing the multiply, dividing the result by 360.

Is there a trick I could use which would get rid of the multiply and divide in the final code?

And for those of you asking, "why?", suppose that in some configurations, B is not == A and you need to scale a variable.

Perhaps it is not optimized because the compiler is concerned about overflow, in which case, even if `A == B`, the result may be different from `s16_myvar`. — owacoder, Oct 08 '15 at 22:42
@iharob - Not necessarily. If `A` and `B` both evaluate to integers, the result may not be the desired result either, because of integer division truncation. — owacoder, Oct 08 '15 at 22:45
http://port70.net/~nsz/c/c11/n1570.html#5.1.2.3p15 and surrounding paragraphs might help. Some rules of school mathematics do not apply to computer arithmetic. — too honest for this site, Oct 08 '15 at 22:46
Typical solution would be to use fixed-point arithmetic, e.g. 16.16 in a 32-bit variable. Just scale everything by 2**16, then divide the constants first. Watch out for integer overflow (UB for signed integers), however! — too honest for this site, Oct 08 '15 at 22:50
@Olaf, I understand what you mean. I was hopeful that Diab would recognize in some "meta" sense that it could replace the whole thing with temp = s16_myvar! — MAF, Oct 08 '15 at 22:55
Try leaving out the parentheses. Too lazy to check in the standard, but evaluation rules might forbid optimization with them. — too honest for this site, Oct 08 '15 at 22:59
Depending on the type of `s16_myvar` and the compilation options (e.g., wrapping signed integer overflow), on some processors the implementation would NOT be allowed to perform this optimization. — ouah, Oct 08 '15 at 23:02
@ouah: Signed integer overflow is **always** UB according to the standard, there is no "option". However, if `A == B`, they will cancel out anyway. But the compiler might feel forced to behave that way due to the parentheses. Embedded compilers are very "conservative" (euphemism) when it comes to optimisations, much more than gcc for instance. That's why they are so expensive ;-) — too honest for this site, Oct 08 '15 at 23:11
@Olaf many compilers have options to give signed integer overflow a defined behavior, for example gcc `-fno-strict-overflow` option. Standard allows compilers to augment the language to have definitions for UB. — ouah, Oct 08 '15 at 23:16
@ouah: I did not say different: "... always UB according to the standard ...". Point is, that the compiler could optimize this very well, because it generates no different result unless overflow occurs by the multiplication and **if** overflow occurs, things are lost anyway. Things are different for mul after div, which changes result due to integer truncation. These expensive commercial compilers are often very bad at optimization (in case you missed the irony). — too honest for this site, Oct 08 '15 at 23:22
@Olaf in the specific case of OP expression, it could never overflow because of the integer promotion of `s16_myvar` (in OP system), the compiler would have been allowed to optimize it anyway. — ouah, Oct 08 '15 at 23:25
@ouah: Assuming 32 bit integers, yes. But even with 16 bit integers, it would have been no matter. This is one of the (few) advantages of UB. — too honest for this site, Oct 08 '15 at 23:33
Is it at least optimizing the constant-division itself? If not then you should not be surprised that it doesn't do some other optimization either — harold, Oct 09 '15 at 15:12
With gcc you can use `-Ofast` or add `-funsafe-math-optimizations` or `-ffast-math` ... this will make optimizations that will not account for order of operations, overflows, and other subtleties. For clang `-Ofast` and `-ffast-math` works, but `-funsafe-math-optimizations` does not. Your compiler may have similar options. — technosaurus, Mar 09 '17 at 16:19
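The fixed-point suggestion from the comments can be sketched roughly as follows. This is only an illustration under assumptions: the names `Q16_ONE`, `q16_ratio` and `q16_rescale` are made up here, and 64-bit intermediates are used so that the products cannot overflow.

```c
#include <stdint.h>

#define Q16_ONE 65536 /* 1.0 in 16.16 fixed point (hypothetical name) */

/* Precompute a / b as a 16.16 ratio; ideally done once, on constants. */
static int32_t q16_ratio(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * Q16_ONE) / b);
}

/* Scale x by a 16.16 ratio. The 64-bit product avoids signed overflow.
   The caller must ensure the result fits in 16 bits. */
static int16_t q16_rescale(int16_t x, int32_t ratio)
{
    return (int16_t)(((int64_t)x * ratio) / Q16_ONE);
}
```

With `A == B` the ratio is exactly `Q16_ONE` and the input comes back unchanged. For ratios that are not exactly representable (80/100, say), the result can be one less than `(x * A) / B`, because the ratio itself has already been truncated.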

score 2 · Answer 1 · 2015-10-09T06:59:25.937


Just a supposition: integer expressions like (a * x) / b can be simplified to (a / b) * x when b divides a and a * x does not overflow. It can be that the optimizer designers just didn't go that deep or considered such expressions as unlikely/stupid.

Update:

After a remark by Olaf, the overflow condition is irrelevant, as it is a case of undefined behavior so that the run-time is free to return any value.

edited Oct 09 '15 at 06:59

answered Oct 08 '15 at 23:04


`s16_myvar` is promoted to `int` 32-bit in his system before the multiplication, no overflow possible. – ouah Oct 08 '15 at 23:08
@ouah: why would it be promoted? – Oct 08 '15 at 23:15
Rules of integer promotion (`s16_myvar` is 16-bit and `int` is 32-bit in OP system). – ouah Oct 08 '15 at 23:17

If `a * x` overflows, you are in the world of UB anyway, so that does not matter for optimisation. However, I agree with you that the compiler is just not good at optimizing this quite common case. Changing to division first is usually not an option, as you cannot guarantee the first constraint (which is still valid). – too honest for this site Oct 08 '15 at 23:44

@Yves Daoust: The promotion is not caused by the presence of integral constants. Types smaller than `int` are always unconditionally promoted to `int` regardless of context. – AnT stands with Russia Oct 09 '15 at 01:03
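The promotion argument can be checked exhaustively. The sketch below assumes a 32-bit `int`, as on the OP's target, and the helper name `mul_div_is_identity` is invented for illustration. It tries every 16-bit input and confirms that multiplying and then dividing by 360 never changes the value.

```c
#include <limits.h>
#include <stdint.h>

#if INT_MAX < 2147483647
#error "This sketch assumes int is at least 32 bits wide."
#endif

/* Returns 1 if (v * 360) / 360 == v for every int16_t value v, else 0. */
static int mul_div_is_identity(void)
{
    int32_t x;
    for (x = INT16_MIN; x <= INT16_MAX; ++x) {
        int16_t v = (int16_t)x;
        int r = (v * 360) / 360; /* v is promoted to int before the multiply */
        if (r != v) {
            return 0;
        }
    }
    return 1;
}
```

The largest intermediate product is 32767 * 360 = 11,796,120, far below the 32-bit limit, so no overflow can occur in this specific expression.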

Mike Dunlavey · Answer 2 · 2015-10-09T00:52:57.540

1

Let signed 16-bit variable s16_myvar be 32700. Suppose A and B are perfectly good 32-bit signed ints, like 360000.

Then the variable is promoted to an int, and the multiplication occurs, giving you 11,772,000,000, which wraps around to -1,112,901,888.

Divide that by B and you get -3091.

Is that what you wanted? You may know that the numbers won't wrap around, but the compiler can't assume it.
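The arithmetic above can be reproduced without actually triggering signed overflow, which would be undefined behavior. The following sketch emulates 32-bit two's-complement wraparound using only well-defined operations. The function names are hypothetical.

```c
#include <stdint.h>

/* Emulate what a wrapping 32-bit multiply would leave in a register. */
static int32_t wrap_to_int32(int64_t value)
{
    uint32_t low = (uint32_t)value; /* reduction modulo 2^32 is well-defined */
    int64_t wrapped = (int64_t)low;
    if (wrapped > INT32_MAX) {
        wrapped -= 4294967296LL;    /* reinterpret the bit pattern as signed */
    }
    return (int32_t)wrapped;
}

/* (x * a) / b, with the product wrapped to 32 bits before the division. */
static int32_t scale_with_wrap(int16_t x, int32_t a, int32_t b)
{
    int64_t product = (int64_t)x * a; /* exact product, no overflow in 64 bits */
    return wrap_to_int32(product) / b;
}
```

Calling `scale_with_wrap(32700, 360000, 360000)` gives -3091 rather than 32700, matching the numbers in this answer.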

edited Oct 09 '15 at 00:52

answered Oct 09 '15 at 00:27

Mike Dunlavey



I was on the same line initially. But after a remark by @Olaf, I understood that the overflowing expressions are a case of undefined behavior, and the code generator is allowed to take any action, including giving the right answer! – Oct 09 '15 at 06:56
But the compiler knows the values. They are small. After preprocessing, the expression is `(s16_myvar * 360) / 360`. On a 32bit platform (and if `s16_myvar` really is s16), this is absolutely equivalent to just `s16_myvar`. – undur_gongor Oct 09 '15 at 09:34
@undur_gongor: When you're a compiler-writer, writing optimizations, you tend to stay away from gray areas, because you will have 10^3 - 10^6 users, some of them possibly doing tricky things, like wrapping-around, as the machine instructions would do. The "bug reports" will drive you crazy. Why ask for trouble? – Mike Dunlavey Oct 09 '15 at 12:30

score 0 · Answer 3 · answered Oct 09 '15 at 15:05


I am going to try this function form as a possible solution. My idea is that the compiler may optimize out most of the instructions if it notices that the scale_a and scale_b are identical. I'll post back my result.

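/* S16 is assumed to be the project's signed 16-bit typedef. */
__inline__ S16 RESCALE_FUNC(short s16_input, short scale_a, short scale_b)
{
    /* Scales are passed by value: A and B expand to integer constants, not addresses. */
    return (scale_a == scale_b) ? s16_input : (S16)((s16_input * scale_a) / scale_b);
}

temp = RESCALE_FUNC(s16_myvar, A, B);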

answered Oct 09 '15 at 15:05

MAF


The early results seem promising... _in situ_ there do not seem to be any additional assembly instructions; it seems like the compiler optimized the function out if the scales are the same. I need to try a few more examples with different scales back to back to make sure things are actually working. – MAF Oct 09 '15 at 15:49
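Another option, not taken from the thread, is to let the preprocessor make the decision so that the result does not depend on the optimizer at all. This is a minimal sketch that assumes both scales are integer constants visible to `#if`. The names `SCALE_NUM`, `SCALE_DEN`, `RESCALE` and `rescale_sample` are placeholders.

```c
#include <stdint.h>

/* Placeholder configuration constants, mirroring A and B in the question. */
#define SCALE_NUM 360
#define SCALE_DEN 360

#if SCALE_NUM == SCALE_DEN
/* Identical scales: no multiply or divide is ever emitted. */
#define RESCALE(x) ((int16_t)(x))
#else
/* Different scales: widen to 32 bits so the product cannot overflow. */
#define RESCALE(x) ((int16_t)(((int32_t)(x) * SCALE_NUM) / SCALE_DEN))
#endif

/* Small wrapper so the macro can be exercised like a function. */
static int16_t rescale_sample(int16_t v)
{
    return RESCALE(v);
}
```

The trade-off is that this only works when the scales are compile-time constants. It does not help if they ever become run-time variables.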

score 0 · Answer 4 · answered Mar 07 '17 at 02:23

This is good behavior to me. Evaluation order matters. Suppose you are doing fixed-point arithmetic (or any regular integer arithmetic), and you want to calculate "Output = 80% of your Input". Then you write: Output = (Input * 80) / 100; Let's assume Input = 201. If the compiler decides to do 80/100 first: Output = 201 * (80 / 100) = 201 * 0 = 0. This is because 80 and 100 are integers (the same applies to int variables).

But since you explicitly put some parentheses, then you get: Output = (Input * 80) / 100 = (201 * 80) / 100 = 16080 / 100 = 160.

There, 160 is "approximately" 80% of 201 (remember, we are using integers not floats).
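The two evaluation orders can be compared directly. The helper names below are made up for illustration; both functions use plain `int` arithmetic, so the division truncates toward zero.

```c
/* Multiply first, then divide: keeps the precision of the intermediate product. */
static int percent_mul_first(int input, int num, int den)
{
    return (input * num) / den;
}

/* Divide first, then multiply: num / den truncates before the multiply. */
static int percent_div_first(int input, int num, int den)
{
    return input * (num / den);
}
```

For `input = 201`, `num = 80`, `den = 100`, the first function returns 160 and the second returns 0, as worked out above.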

Hello, and welcome to Stack Overflow. Please format your code to be more readable. For help with formatting, please see http://stackoverflow.com/help/formatting. — Chait, Mar 07 '17 at 02:51
