
I have a simple piece of 32-bit assembly code that I wrote:

movl  $0x542412e6, %eax
movl  %ebp, %edx
addl  $0x30, %edx
movl  %edx, %ebp
pushl $0x08048dd6
ret

When I run this command:

gcc -m32 -c e.s

and disassemble the resulting object file (e.g. with objdump -d e.o), I get the following 18 bytes:

0:  b8 e6 12 24 54          mov    $0x542412e6,%eax
5:  89 ea                   mov    %ebp,%edx
7:  83 c2 30                add    $0x30,%edx
a:  89 d5                   mov    %edx,%ebp
c:  68 d6 8d 04 08          push   $0x8048dd6
11: c3                      ret 

Why is the object code 18 bytes and not 20 or 16? Shouldn't it always be in 4-byte words for a 32-bit machine?

Chace Fields
  • Nope. The code must be on the boundaries defined by the architecture. That may have some relationship to the "word size" of the machine, but the relationship is rarely that strong. (And in particular the x86 instruction set has its heritage back in 8 and 16-bit machines, and there is some degree of forward/backward compatibility.) – Hot Licks Nov 19 '13 at 03:02

2 Answers


Instruction size is not related to data bus or address bus size. Some 16-bit x86 CPUs, such as the 8088, have three totally different sizes: an 8-bit data bus, a 20-bit address bus and variable-length instructions. Modern 32-bit and 64-bit x86 CPUs keep the variable-length encoding for backward compatibility.
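
In fact the listing in the question already shows lengths from 1 to 5 bytes, and the total is simply the sum of the individual instruction lengths: 5 + 2 + 3 + 2 + 5 + 1 = 18 bytes, which is why the object code is 18 bytes rather than a multiple of 4. To illustrate the range further, here are some instruction lengths (the first four encodings are taken from the listing above; the last one is hand-assembled, so treat it as a sketch and verify with your own assembler):

ret                                   # c3                       -- 1 byte
movl  %ebp, %edx                      # 89 ea                    -- 2 bytes
addl  $0x30, %edx                     # 83 c2 30                 -- 3 bytes
movl  $0x542412e6, %eax               # b8 e6 12 24 54           -- 5 bytes
movl  $0x12345678, 0x10(%ebp,%esi,4)  # c7 44 b5 10 78 56 34 12  -- 8 bytes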

Just look at the movl $0x542412e6, %eax and pushl $0x08048dd6 lines and you'll see that it's impossible to encode a 32-bit immediate plus an opcode and a register selector within 32 bits. An architecture that uses fixed-length 32-bit instructions must instead build a 32-bit constant with multiple instructions, or load it from a literal pool.
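
For example, on MIPS, a classic fixed-width 32-bit RISC, each instruction only has room for a 16-bit immediate, so loading the same constant takes two 4-byte instructions (a sketch; the register choice is arbitrary):

lui  $t0, 0x5424         # $t0 = 0x54240000 (load the upper 16 bits)
ori  $t0, $t0, 0x12e6    # $t0 = 0x542412e6 (OR in the lower 16 bits)

ARM assemblers take the literal-pool route instead: ldr r0, =0x542412e6 is a pseudo-instruction that the assembler turns into a PC-relative load from a constant placed near the code.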

RISC architectures often have fixed-width instructions as a trade-off between code density and decoder simplicity. But 32-bit RISC architectures with instruction sizes other than 32 bits also exist. For example MIPS16e and ARM Thumb (v1) have 16-bit instructions, whereas Thumb-2 and the Dalvik VM have variable-length instructions. Modern 64-bit RISC architectures don't have 64-bit instructions either; they typically stick with the 32-bit size.

phuclv

x86 does not have fixed-length instructions, nor does it require alignment: an instruction can start at any byte offset, whereas an architecture with alignment requirements forces every instruction to start at certain offsets. This, though, is why x86 processors require much more logic to decode instructions than RISC processors do.

Most RISC architectures, by contrast, do have fixed-length instructions, and those instructions are aligned.
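
You can still ask for alignment on x86 where it helps performance (branch targets, loop tops), but it is achieved by padding rather than demanded by the CPU. A minimal GNU as sketch (the label name is made up for illustration):

        .text
        .p2align 4              # pad with NOPs out to a 16-byte boundary
loop_top:                       # aligned purely for performance, not correctness
        addl  $0x30, %edx
        ret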

shf301
  • Most RISC machines I'm familiar with have both 16 and 32-bit instructions. Including the mother of them all -- the IBM 801. – Hot Licks Nov 19 '13 at 03:08
  • The original RISC machines were devices like the IBM 7000 series machines with simple registers and simple instruction formats. Those machines were 36 bits. See http://en.wikipedia.org/wiki/IBM_7090 – Ira Baxter Nov 19 '13 at 04:54
  • @HotLicks: ARM thumb and MIPS 16-bit compact instructions were added later, as an optimization for embedded systems, costing CPU complexity to support two modes, and/or to support a mix of short and long instructions. Classic RISC systems like early MIPS, Alpha, SPARC, RISC-V, and many others only have 32-bit instruction words. Same for current AArch64: only 32-bit instruction words are supported. (Even early ARM was much less RISCy than systems like MIPS; it has a single instruction that does from 1 to 15 stores, so it's pretty unavoidably microcoded. And complex addressing modes...) – Peter Cordes Apr 02 '19 at 12:34
  • @PeterCordes - 16-bit instructions were used because that was all that was needed for most cases, and fewer bits meant that memory throughput was less of a bottleneck. 32-bit instructions were added to allow literals and long addresses to be included in the instructions, but most instructions did not need these. – Hot Licks Apr 02 '19 at 12:44
  • @HotLicks: I wasn't talking about IBM 801, I was only talking about later CPUs whose first generation was something like a https://en.wikipedia.org/wiki/Classic_RISC_pipeline (and ISAs that mimic that, like DEC Alpha.) They typically have an I-cache, and simple decode was a priority. Classic MIPS I actually used instruction bits as internal control signals directly, so decoding was very easy. This would not have been possible with compact 16-bit instructions, without an extra processing stage. But branches need to compute taken/not-taken in the ID stage, so there's only one branch-delay slot – Peter Cordes Apr 02 '19 at 12:49
  • @PeterCordes - Having been pitched on the 801 since its inception, I can tell you that the big bottleneck was memory throughput. That was the reason for the dual sized instructions. – Hot Licks Apr 02 '19 at 13:01
  • @HotLicks: Yes, that makes perfect sense there. It's about 10 years older than the "classic RISC" 5-stage pipeline machines I was talking about, and I think doesn't have an I-cache. 8086 was another CPU where instruction fetch is the major bottleneck, despite its compact variable-length instruction encoding. (It's fully CISC of course, going much further in the direction of harder to decode but higher code density than 801.) – Peter Cordes Apr 02 '19 at 13:05