Why is this code being generated by avr-gcc and how does it work?

Question

This is a snippet of disassembled AVR code from a C project I'm working on. I noticed this curious code being generated, and I can't understand how it works. I'm assuming it's some sort of ridiculous optimization...

What is the explanation?

92:         ticks++;         // unsigned char ticks;
+0000009F:   91900104    LDS       R25,0x0104     Load direct from data space
+000000A1:   5F9F        SUBI      R25,0xFF       Subtract immediate
+000000A2:   93900104    STS       0x0104,R25     Store direct to data space
95:         if (ticks == 0) {
+000000A4:   2399        TST       R25            Test for Zero or Minus
+000000A5:   F009        BREQ      PC+0x02        Branch if equal
+000000A6:   C067        RJMP      PC+0x0068      Relative jump

Specifically, why does the second instruction subtract 0xFF from R25 instead of just INC R25?

In case it's not obvious, I'm referring to the second line: Subtract 0xFF from R25... why not just "INC R25" ? — Mark Renouf, Aug 26 '09 at 22:17
as pointed out by @PeterCordes http://stackoverflow.com/questions/1337831/why-is-this-code-being-generated-by-avr-gcc-and-how-does-it-work/1337882?noredirect=1#comment62982826_1337882 - code like this gets emitted virtually **only** with `-O0`, making the question itself a bit moot... with `-O3`/`-Os`, those instructions would get changed into more compact versions (branch on flags etc.) – Jun 10 '16 at 21:15

score 5 · Answer 1 · edited Jun 10 '16 at 15:55

tl;dr the compiler was designed to use the more portable, efficient & general solution here.

The SUBI instruction sets the C (carry) and H (half-carry) CPU flags for use by subsequent instructions, whereas INC does not. (There is no ADDI on 8-bit AVR, BTW, so to add an immediate value x you subtract -x instead.) Since both SUBI and INC are 2 bytes long and execute in 1 clock cycle, you lose nothing by using SUBI. OTOH, with an 8-bit counter you can then easily detect a rollover with BRCC/BRCS (after SUBI with 0xFF, carry is clear only when the register wrapped from 0xFF to 0x00), and with a 16- or 32-bit counter it lets you increment in a very simple way. With just INC on the low byte, 0x00FF would become 0x0000, so you'd have to check whether the lowest byte is 0xFF before INCing. With SUBI you just subtract -1 (0xFF) from the lowest byte and then SBCI 0xFF on each following byte, which ensures all the potential carry bits have been accounted for.
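To make the carry argument concrete, here is a minimal C sketch that models only the C flag of SUBI and SBCI and uses it to increment a 16-bit value one byte at a time. It is an illustration, not compiler output: the helper names (`subi`, `sbci`, `inc16_subi_sbci`, `inc16_inc_only`) are made up for this example, and the high byte is handled with SBCI because the flag SUBI leaves behind is a borrow.

```c
#include <stdint.h>

/* SUBI Rd,K : Rd <- Rd - K. The C flag is set when K > Rd (a borrow).
   Hypothetical model; every flag other than C is ignored. */
static uint8_t subi(uint8_t rd, uint8_t k, uint8_t *carry) {
    *carry = (uint8_t)(k > rd);
    return (uint8_t)(rd - k);
}

/* SBCI Rd,K : Rd <- Rd - K - C. The C flag is set when K + C > Rd. */
static uint8_t sbci(uint8_t rd, uint8_t k, uint8_t *carry) {
    unsigned sub = (unsigned)k + *carry;
    uint8_t result = (uint8_t)(rd - sub);
    *carry = (uint8_t)(sub > rd);
    return result;
}

/* 16-bit increment as "subi lo,0xFF" followed by "sbci hi,0xFF". */
static uint16_t inc16_subi_sbci(uint16_t v) {
    uint8_t c;
    uint8_t lo = subi((uint8_t)(v & 0xFF), 0xFF, &c);
    uint8_t hi = sbci((uint8_t)(v >> 8), 0xFF, &c);
    return (uint16_t)((hi << 8) | lo);
}

/* What happens if only the low byte is INCed: nothing reaches the high byte. */
static uint16_t inc16_inc_only(uint16_t v) {
    uint8_t lo = (uint8_t)((v & 0xFF) + 1);
    return (uint16_t)((v & 0xFF00) | lo);
}
```

Under this model `inc16_subi_sbci(0x00FF)` yields 0x0100, while `inc16_inc_only(0x00FF)` yields 0x0000, which is the failure case described above.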

Further reading:

https://lists.gnu.org/archive/html/avr-gcc-list/2008-11/msg00029.html

http://avr-gcc-list.nongnu.narkive.com/SMMzdBkW/foo-subi-vs-inc

edited Jun 10 '16 at 15:55

answered Aug 26 '09 at 22:27

Greg Hewgill

I included the next few instructions... does that help narrow things down? I still don't get the point of "SUBI R25,0xFF" vs. "INC R25", both are 1 clock. And if R25 rolls over from 255 to 0, the TST R25 instruction would work just fine. – Mark Renouf Aug 26 '09 at 23:32
Ok. I thought this through a bit and realized that in unsigned math, subtracting 0xFF is the same as adding 0x01. Weird, but why do it this way? Does INC not set flags for the branch? – Mark Renouf Aug 26 '09 at 23:35
2

It looks like INC sets the V,N,S,Z flags, but SUBI sets H,V,N,S,Z,C. Since the compiler generated a TST instruction next, it appears not to be using the flags from the SUBI anyway. Also, using a SUBI seems odd because an ADDI with 0x01 would be equivalent again. It's sort of a mystery really. – Greg Hewgill Aug 26 '09 at 23:44
2

I just noticed there is no ADDI instruction. Perhaps that explains part of it. – Greg Hewgill Aug 26 '09 at 23:47
2

@GregHewgill I've taken the liberty to (of course feel free to rollback if you wish) expand your answer; since you basically hit the nail with the flags here, I've added some links, rationale etc. to substantiate it. I didn't want to post another answer since you basically answered it already, yet I felt that your answer would be better with a bit more behind-the-scenes compiler trivia. – Jun 10 '16 at 15:59
@vaxquis: that code is probably from `gcc -O0`, right? Otherwise gcc would be smart enough to skip the `tst` and branch on flags set by `subi`. – Peter Cordes Jun 10 '16 at 20:33
1

@vaxquis: yeah, I look at gcc asm output all the time on http://gcc.godbolt.org/ (which includes an [ancient AVR-gcc](https://godbolt.org/g/17krjp)), I just don't have experience with AVR. – Peter Cordes Jun 10 '16 at 21:27
@PeterCordes my avr-gcc 4.9.2 emits almost identical assembly, at least for this example; anyway, in the general case, it's just like you said: even at `-O1`, it's just `subi r17,lo8(-(1))` and immediately `brne .L3`, without any additional `sts` nor `tst` (https://godbolt.org/g/ieM6PJ) – Jun 10 '16 at 22:21

score 5 · Accepted Answer · answered Aug 27 '09 at 05:53

The SUBI instruction can be used to add/subtract any 8 bit constant to/from an 8 bit value. It has the same cost as INC, i.e. instruction size and execution time. So SUBI is preferred by the compiler because it is more general. There is no corresponding ADDI instruction, probably because it would be redundant.
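As a rough illustration of why no ADDI is needed, the sketch below adds a constant to an 8-bit value by subtracting its two's-complement negation, which is the same arithmetic as `subi rN,lo8(-(k))`. The helper name `add_via_subi` is hypothetical and only mirrors the arithmetic, not the flags.

```c
#include <stdint.h>

/* Add k to an 8-bit register value the way SUBI does it:
   subtract the 8-bit two's-complement negation of k.
   Hypothetical helper for illustration only. */
static uint8_t add_via_subi(uint8_t rd, uint8_t k) {
    uint8_t neg_k = (uint8_t)(0u - k);   /* corresponds to lo8(-(k)); 0xFF for k == 1 */
    return (uint8_t)(rd - neg_k);        /* wraps modulo 256, like the register */
}
```

With k = 1 the negated constant is 0xFF, which matches the `SUBI R25,0xFF` in the question's listing.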

answered Aug 27 '09 at 05:53

starblue

1

you're right that both have the same size/CPU clocks - yet saying "`SUBI` is preferred (...) because it is more general" is just handwaving without describing any specifics; `SUBI` ain't "more general", because it requires registers r16-31 instead of the `INC`'s r0-31 range. avr-gcc compiler was *hand-tuned* to do `SUBI` instead of `INC`, for the very reason given by Greg below - `SUBI` sets the carry flag properly, whereas `INC` just ignores it. It makes 16/32-bit counters much simpler with `SUBI`, because you don't have to check lower bytes for overflow at all! – Jun 10 '16 at 15:40
