1

In a 32-bit processor, as I understand it, each instruction is 32 bits. So for the MOV instruction in assembly, how do you fit the opcode plus the parameters into only 32 bits? For example:

MOV register, [address]

Doesn't the address take up 32 bits by itself? So wouldn't that take up the entire instruction? Maybe I have it all wrong, but I'm trying to implement a VM in C/C++. Help?

Seki
  • 7
    "each instruction is 32 bits" No, instruction length depends on opcode and varies. For example, `nop` is one byte only. – Roman R. Mar 26 '14 at 22:00
  • Instructions can be variable length. – Mysticial Mar 26 '14 at 22:00
  • @Roman R. Then are any more than 32 bits? – user3466304 Mar 26 '14 at 22:02
  • 2
    If somebody said "32 bit instructions", they probably meant "instructions for a 32-bit-addressing architecture" – aschepler Mar 26 '14 at 22:03
  • 2
    not all CPUs use the same scheme. A processor's bitness doesn't necessarily indicate instruction width. For example, 64-bit MMIX uses 32-bit fixed length instructions and 32-bit x86 uses variable length instructions. – bames53 Mar 26 '14 at 22:28

7 Answers

6

x86 instructions have variable length. The CPU starts reading an instruction at its first byte, identifies the "opcode", then keeps reading the following bytes as that particular instruction requires.
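Since you're writing a VM, here is a minimal sketch of the same idea in C: the first byte selects the opcode, and the opcode decides how many more bytes belong to the instruction. The opcodes (`OP_NOP`, `OP_MOV_REG_ABS`, `OP_HALT`), their encodings, and the function names are all invented for illustration; they are not real x86 encodings.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for a toy VM (assumptions, not real x86 encodings). */
enum { OP_NOP = 0x00, OP_MOV_REG_ABS = 0x01, OP_HALT = 0xFF };

/* Returns the total length in bytes of the instruction starting at code[0],
   or 0 if the opcode is unknown or the instruction is cut off. */
size_t instr_length(const uint8_t *code, size_t avail)
{
    size_t len;
    if (avail == 0)
        return 0;
    switch (code[0]) {
    case OP_NOP:
    case OP_HALT:
        len = 1;            /* opcode byte only */
        break;
    case OP_MOV_REG_ABS:
        len = 1 + 1 + 4;    /* opcode, register number, 32-bit address */
        break;
    default:
        return 0;           /* unknown opcode */
    }
    return len <= avail ? len : 0;
}

/* Walks the byte stream one instruction at a time and counts instructions.
   Returns -1 if an instruction is unknown or truncated. */
int count_instructions(const uint8_t *code, size_t size)
{
    size_t pc = 0;
    int count = 0;
    while (pc < size) {
        size_t len = instr_length(code + pc, size - pc);
        if (len == 0)
            return -1;
        pc += len;          /* the next instruction starts right after this one */
        count++;
    }
    return count;
}
```

With this layout a full 32-bit address fits without trouble, because the instruction that carries it is simply 6 bytes long instead of 4.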

I stopped the debugger (Visual Studio) at a random point. Its disassembly window has a "Show Code Bytes" option, which shows how many bytes each instruction takes. Have a look below:

[Screenshot: Visual Studio disassembly window with "Show Code Bytes" enabled]

In particular, have a look at the line with mov [ebp-15Ch], eax, which is close to the one in your question. The corresponding bytes include A4 FE FF FF, which is the 32-bit value -15Ch stored in little-endian byte order.
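To check that reading of the bytes, here is a small sketch that reassembles four little-endian bytes into a signed 32-bit value. The function name `read_disp32_le` is made up for this example.

```c
#include <stdint.h>

/* Assembles four little-endian bytes (lowest byte first) into a signed
   32-bit value, e.g. a displacement embedded in an instruction. */
int32_t read_disp32_le(const uint8_t *p)
{
    uint32_t u = (uint32_t)p[0]
               | ((uint32_t)p[1] << 8)
               | ((uint32_t)p[2] << 16)
               | ((uint32_t)p[3] << 24);

    if (u <= (uint32_t)INT32_MAX)
        return (int32_t)u;

    /* Negative value: undo two's complement without relying on
       implementation-defined unsigned-to-signed conversion. */
    return -(int32_t)(uint32_t)~u - 1;
}
```

Feeding it the bytes A4 FE FF FF gives -15Ch (-348 in decimal), matching the disassembly.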

Peter Cordes
Roman R.
  • Oh cool! I didn't know that, but how many bytes is the opcode? – user3466304 Mar 26 '14 at 22:10
  • 1
    For opcode length, you might want to start with looking at [x86 Opcode Cheat Sheet](http://pnx.tf/files/x86_opcode_structure_and_instruction_overview.png). – Roman R. Mar 26 '14 at 22:12
5

Actually, opcodes can be variable length or fixed length depending on the architecture. There are also 16- and 32-bit processors with a 20-bit address bus. The bit width of an architecture is not very clear these days; it's more of a historical thing. I guess the best "definition" these days might be the width of a "logical" internal data bus. (Remember the 8088: a 16-bit device with an 8-bit multiplexed data bus.)

turboscrew
  • I think the most sane modern definition of the width of the architecture is the size of the general-purpose integer registers. So e.g. the Linux [x32 ABI (long mode with 32-bit pointers)](https://en.wikipedia.org/wiki/X32_ABI) is still running in 64-bit mode. 64-bit integer math is efficient. Historically more weight was given to the width of data busses, but that's not as meaningful these days when a high-speed serial bus can be as fast as a lower-clocked wide parallel bus. – Peter Cordes Sep 04 '16 at 13:19
  • Fair question. I don't know DSPs. But I was probably overstating the case that there was any consistent way to decide what to call a CPU in terms of width. – Peter Cordes Sep 09 '16 at 16:26
4

Some processors, such as x86, solve this problem by having variable-length instructions, so instructions are not all 32 bits long: x86 has some instructions that are a single byte long, and some that are over 10 bytes. (This also means that instructions aren't always aligned on 32-bit boundaries, obviously.)

Other processors solve it by "two-part constant loading". For example, ARM, MIPS and 29K have instructions that load the "low part" and the "high part" as separate entities. Typically, loading the low part clears or sign-extends the upper part, and loading the high part leaves the low part unchanged; that way, small values can be loaded in a single instruction.
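As a rough sketch of how that works (not modelled on the exact encoding of any particular instruction set), the two steps can be written in C like this. The names `load_low`, `load_high` and `load_const32` are hypothetical, and the low-part load here uses the "clears the upper part" variant.

```c
#include <stdint.h>

/* "Load low": writes a 16-bit immediate and clears the upper half
   (the zero-extending variant; some ISAs sign-extend instead). */
uint32_t load_low(uint16_t imm16)
{
    return (uint32_t)imm16;
}

/* "Load high": replaces the upper 16 bits and leaves the lower 16 alone. */
uint32_t load_high(uint32_t reg, uint16_t imm16)
{
    return (reg & 0x0000FFFFu) | ((uint32_t)imm16 << 16);
}

/* Builds a full 32-bit constant from two 16-bit immediates,
   i.e. two instructions on such a machine. */
uint32_t load_const32(uint32_t value)
{
    uint32_t reg = load_low((uint16_t)(value & 0xFFFFu));
    return load_high(reg, (uint16_t)(value >> 16));
}
```

Each step only needs a 16-bit immediate, which leaves room for the opcode and register number inside a fixed 32-bit instruction.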

Of course, a lot of the time we're not dealing with constant addresses anyway, but with variables that hold addresses (aka pointers or references), in which case a "load" instruction loads from an address that is in a register rather than from a constant address.

Mats Petersson
  • And then there is PC-relative indirect with auto-increment... It's not really an instruction with the full address included in the instruction, but in assembly it can be used like that. The old PDP came quite close: MOV (PC++) R0. It loaded a (16-bit) word from the address in PC after instruction fetch and then incremented PC by one word to skip the data loaded. It just lacked the indirect. – turboscrew Sep 09 '16 at 16:27
  • Yes, one of my favourite instruction sets. It lacked the DIRECT addressing modes - and a load from an absolute memory address actually encodes as `MOV @(PC++), Rn`, with the address in the next word. – Mats Petersson Sep 10 '16 at 04:38
  • Oops, forgot. It DID have the indirection too. It's decades since I was in touch with PDP assembly... A very good example how to "simulate" a full address in the opcode... A similar kind of trick was used in CPM to pass SW interrupt "parameters". The return address in stack was used as a parameter pointer which was then incremented (to skip the parameter) after reading the parameter in the interrupt routine. – turboscrew Sep 11 '16 at 16:29
3

In an instruction set like ARMv7's ARM state, where every instruction is exactly 32 bits, you cannot store the opcode and an absolute address in a single instruction. What you have to do is either

  1. load the address from memory into a register and then jump to the address in the register, or
  2. store an address relative to the program counter [pc].

The ARM architectural manual can help with this.
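To illustrate the PC-relative idea, here is a minimal sketch on a toy word-addressed memory. Everything in it is an assumption made for the example (the function names, the word addressing, the lack of bounds checks); real ARM details such as byte addressing and the PC reading ahead of the current instruction are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Fetches a full 32-bit constant stored 'offset' words away from the
   instruction at index 'pc'. Only the small offset has to fit inside
   the instruction itself. No bounds checking in this sketch. */
uint32_t load_pc_relative(const uint32_t *mem, size_t pc, int32_t offset)
{
    return mem[(size_t)((int64_t)pc + offset)];
}

/* Two steps: fetch the absolute address from a constant stored near the
   code, then load the data that the address points to. */
uint32_t load_via_literal(const uint32_t *mem, size_t pc, int32_t offset)
{
    uint32_t address = load_pc_relative(mem, pc, offset);
    return mem[address];
}
```

The full 32-bit address lives in memory next to the code as ordinary data, so the instruction only has to encode a short distance to it.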

doron
3

There are good answers here explaining instruction formats. However, none seems to clear up what appears to be your confusion: on a 32-bit architecture, it is the instruction operands that are 32 bits in length (*), not the instructions. An instruction is composed of an operation code [and operands] (not all instructions have operands, e.g. nop, sti).

(*) This is not a hard-and-fast rule. For instance, the 32-bit x86 architecture has instruction set extensions (SSE, AVX) that take 128-bit and even 256-bit operands.

bolov
2

RISC processors, like ARM, have fixed-size instructions. x86 is a CISC processor (variable-size instructions). In the case of a RISC processor, the 32-bit address is split into two 16-bit parts (high and low) and is loaded into one register by executing two load instructions; then we can load (mov) the contents of that address into another register. So it can take up to 3 instructions to move something from memory into a register.

steli
-1

The CPU doesn't "communicate" with the OS. A CPU is just circuitry. The instruction-decode circuitry activates other circuits based on pattern-matches against the instruction bits that it read from memory.

They're all standardized so the same OS can run on any x86 CPU, for example.

However, a 32-bit CPU would processing lower of process than a 64-bit CPU. The CPU can be divided into units, then they could be accessed like a memory IC.

So it's still hard to write software for your hardware if you're not supported by a team.

Peter Cordes
Lan...
  • This is nearly unreadable. You need to [edit] it to leave spaces after a period, and capitalize the first word of the next sentence. It starts out making a useful point, but then quickly stops making any sense. IDK if it got lost in translation or what. – Peter Cordes Sep 04 '16 at 13:26
  • i don't deserve to take a negative number one peter cordes - the man who knows how the things is work. – Lan... Sep 06 '16 at 10:12
  • Flattery will get you everywhere... but I wasn't able to completely fix your answer for you, because I still don't even know what point you're trying to make in the second half. I applied proper formatting and English grammar / structure where possible, but I just gave up on the 3rd paragraph, so both those sentences are not correct English, and aren't even close to anything correct. This probably still deserves a downvote, but I did remove mine since you're trying to help (just not with much success, unfortunately). – Peter Cordes Sep 06 '16 at 12:31
  • the reason is because I'm not using english in my country.And i think the purpose to create the language is to communicate,and almost of people i have meet are all understand what i said,so what is the deal? – Lan... Sep 06 '16 at 12:39
  • I understand that not everyone is fluent in English. I don't mind helping improve the English in an answer so it's easier to read for everyone. But lack of English skills does become a problem when meaning does get lost in translation. That's what happened here. I tried, but I honestly can't tell what some of this answer is even trying to say. I can't tell if your idea is wrong, or if it just got lost in translation. – Peter Cordes Sep 06 '16 at 12:45
  • Also, this question didn't really need another answer, and nobody else said anything about the CPU communicating with the OS in the first place. I don't know why you started off talking about that. – Peter Cordes Sep 06 '16 at 12:48
  • Another thing: Your English appears weak enough that I'm not sure someone would be able to understand it if they didn't already understand the subject you're talking about. It's a lot easier to figure out what someone is saying when you already know what they're probably trying to say. A beginner would have an even harder time following your answer than I did. Since I know the answer, I know there were only a few different things you might be saying about circuits and pattern matching. – Peter Cordes Sep 06 '16 at 12:54
  • There are localized versions of Stack Overflow in other languages. If you can write more clearly in [one of them](http://meta.stackexchange.com/a/117167/280924), that might let you write answers that help more people, because you can say it more clearly. (I haven't looked at your other English answers, so I don't know if they're usually easier to read than this one was, but this one started out a total mess. Leaving a space after a `.` and capitalizing the first word of a sentence is really important for readability, even besides using the right words in the right order.) – Peter Cordes Sep 06 '16 at 13:07
  • They would understand,my english is not too bad like you said.About the final part of your message,i just said what i know,i had designed a cpu structure by my self. – Lan... Sep 06 '16 at 13:08
  • Anyway, I'm not trying to be rude, but your English needs some improvement for it to be nice to read. – Peter Cordes Sep 06 '16 at 13:08
  • `They would understand`. I'm sure most people usually understand what you're trying to say, but that wasn't what happened here, unfortunately. I know a lot about CPUs, assembly language, and hardware, but I still can't figure it out. If you make another edit so I can figure out what you meant, I can help you put it into nicer English. – Peter Cordes Sep 06 '16 at 13:11
  • your mean is you can't imagine how the transistors would be ordered to processing,isn't it? – Lan... Sep 06 '16 at 13:20
  • That's not what I meant. Is your 3rd paragraph trying to say that 32-bit CPUs process data in smaller chunks than 64-bit CPUs? I didn't think you were even trying to describe how transistors were arrange to make a 32-bit adder circuit or something. – Peter Cordes Sep 06 '16 at 13:24
  • we spam too much,they use the address to routing the process to the small cpus in the cpu which contain a billion transistors. – Lan... Sep 06 '16 at 13:40