When I have an operation like mov eax, [input+10]:

Does it have a different opcode than this operation:

mov eax, [input] (assuming input now refers to the address that was formerly input+10)?

Peter Cordes
MenNotAtWork

1 Answer

These two instructions should generate the exact same machine code.

That's because 'input' is a symbol that stands for an address, and the assembler adds the constant '10' to it at assembly time. In both cases, the instruction is mov register, [displacement]. The addressing mode being used is called "Direct", a.k.a. "Displacement-Only".

The CPU does not have any addressing mode (nor special opcode) for mov register, [displacement + offset].

(And it would not make any sense to support such an addressing mode, because both displacement and offset are constants.)
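To make this concrete, here is a minimal Python sketch that builds the bytes for `mov eax, [address]` in 32-bit mode by hand. The function name `encode_mov_eax_abs` and the address `0x00402000` are made up for illustration. The encodings are the short `A1 moffs32` form (the special no-ModRM opcode mentioned in the comments) and the general `8B /r` form with a disp32-only ModRM byte. Either way, the `+10` is folded into the 32-bit address field before any bytes are emitted, so nothing in the encoding records that an offset was ever written.

```python
import struct

def encode_mov_eax_abs(address: int, short_form: bool = True) -> bytes:
    """Encode 32-bit-mode `mov eax, [address]` (hypothetical helper)."""
    disp32 = struct.pack("<I", address & 0xFFFFFFFF)  # little-endian address
    if short_form:
        # A1 moffs32: accumulator-only form, no ModRM byte
        return b"\xA1" + disp32
    # 8B /r with ModRM = 0x05 (mod=00, reg=000 for EAX, r/m=101 for disp32)
    return b"\x8B\x05" + disp32

INPUT = 0x00402000  # assumed address of the symbol `input`

# `mov eax, [input+10]`: the sum is computed before encoding
with_offset = encode_mov_eax_abs(INPUT + 10)

# `mov eax, [input]` where `input` now sits at the former input+10
relabelled = encode_mov_eax_abs(0x0040200A)

print(with_offset.hex(" "))       # a1 0a 20 40 00
print(with_offset == relabelled)  # True
```

In practice you would check this against your own assembler's listing output rather than trust a hand-rolled encoder.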

EDIT:

A special case arises when 'input' is exported from one assembly file and imported into another, where you add an offset to it. In this case, the assembler does not know the exact value of 'input' when assembling your instruction, so it is up to the linker to work out the operand at link time or, if the symbol is only resolved dynamically, up to the loader at program-load time. There are two possibilities: either these tools (the combination of assembler + linker) have means of handling this, or they do not.

  • If they do have means of handling this, then the object code emitted by the assembler may look slightly different in the '.obj' file, but the resulting bit pattern once your executable has been loaded into memory and starts running should still be exactly the same.

  • If the tools have no means of handling this, your assembler should give you an error saying that it cannot add '10' to 'input' because 'input' is an external symbol.
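The first case can be sketched with a deliberately simplified model of an absolute 32-bit relocation. It is loosely modelled on relocation types where the addend is stored in the instruction's address field and the linker adds the symbol's address to it. The helper `apply_abs32_relocation`, the object bytes, and the address `0x00402000` are assumptions for illustration, not the output of any particular toolchain.

```python
import struct

def apply_abs32_relocation(code: bytes, field_offset: int, symbol_address: int) -> bytes:
    """Patch a 32-bit field: final value = symbol address + stored addend.

    Simplified, hypothetical model of what a static linker does.
    """
    (addend,) = struct.unpack_from("<I", code, field_offset)
    patched = bytearray(code)
    struct.pack_into("<I", patched, field_offset,
                     (symbol_address + addend) & 0xFFFFFFFF)
    return bytes(patched)

# What the assembler might emit for `mov eax, [input+10]` with `input` external:
# opcode A1, then a placeholder field holding only the addend 10.
object_code = bytes.fromhex("a1 0a 00 00 00")

# Assumed final address the linker assigns to `input`.
INPUT_ADDRESS = 0x00402000

linked = apply_abs32_relocation(object_code, field_offset=1,
                                symbol_address=INPUT_ADDRESS)
print(linked.hex(" "))  # a1 0a 20 40 00
```

The object file differs from the final code only in the address field. After linking, the bytes are the same as if the assembler had known the address all along.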

Mike Nakis
  • The question is tagged nasm, so the last paragraph doesn't apply. It's always byte offsets. – Jester Mar 21 '20 at 19:22
  • @Jester oh, thanks, I will correct. Since you seem to remember your assembly, you might want to also fact-check the 2nd portion which I just edited into my answer. – Mike Nakis Mar 21 '20 at 19:29
  • I don't remember hearing about any assemblers that scale displacements according to an implicit element size. MASM might require a `dword ptr` override to load from a `var dw ...` into EAX, but the `+10` would still be bytes, not words or dwords. IMO just take that part out, or link an example. – Peter Cordes Mar 21 '20 at 21:16
  • This is a weird explanation. Yes `mov eax, mem` is the same opcode for any addressing mode, but `mem` is a placeholder for *any* addressing mode. So `mem + 10` is nonsense. I'd point out that different addressing modes like `[disp32]` vs. `[reg + disp8]` are encoded by the ModRM and optional SIB byte, which come after the opcode. Except that MOV has a special opcode for `mov eax, absolute_address` that skips the ModRM byte, saving space in 32-bit machine code. (In 64-bit code it's a 64-bit absolute address, so you're better off using `mov eax, [RIP + rel32]` with the normal opcode.) – Peter Cordes Mar 21 '20 at 21:19
  • @PeterCordes okay, I removed that part. – Mike Nakis Mar 21 '20 at 21:34
  • @PeterCordes I also replaced 'mem' with 'displacement', I hope it is better now. – Mike Nakis Mar 21 '20 at 21:36
  • There you go, that's much better. I might have mentioned the possibility of other addressing modes like `[reg + disp8]` which would be a different ModRM byte, but that's ok. In your 2nd section: the `+10` is handled at static link time, not load/run time, unless it's a dynamic symbol (only found in a .dll / .so). Yes the relocation entry in the `.obj` will have the symbol name and offset, but we don't run `.obj` / `.o` files. We link them into executables. So really there's 3 cases: static linking, dynamic linking, or a relocation not supported by tools / formats like `[reg - symbol]` – Peter Cordes Mar 21 '20 at 23:28