Suppose I have the following asm program, exit.s:

.section .text
.globl _start
_start:
    movl $1, %eax
    movl $0, %ebx
    int $0x80

When I run the following command to build the object file:

$ as exit.s -o exit.o

What exactly is this doing, and why is this needed? From the book "Programming from the Ground Up":

An object file is code that is in the machine's language, but has not been completely put together.

I thought that was the whole point of the assembly language itself? What, then, is the difference between an assembly program (exit.s) and an object file (exit.o)? Couldn't both be read by a computer? For example, running hexdump on the first line of exit.s gives:

0000000 2e 73 65 63 74 69 6f 6e 20 2e 74 65 78 74 0a   
000000f

Why couldn't the computer understand that directly?

  • Ahh, but picture if you had multiple asm files. Maybe printing.s, inputting.s, fileopen.s, etc. Then there's main.s that calls the routines from all those other files to do something amazing. At some point, you need to stitch them all together. The "call print" that you use in main.s would need to know the address where the code for print ended up. That's what linkers do. A linker takes all the object files, figures out where to put everything in memory, replaces symbols with the actual addresses, then writes out the executable. – David Wohlferd Oct 05 '19 at 20:27
  • An object file contains all the necessary instructions but is not runnable yet. The next build step is to link the object file with all the libraries the program needs. For example, if you use printf, the linker links in its instructions statically or dynamically. Assembly language is composed of mnemonics and operands. A mnemonic is the name of an operation the processor can perform. For example, movl is a mnemonic that tells the processor to move an (immediate) value (1) into the eax register. The problem is that a processor doesn't understand what movl means; it only understands a long sequence of 0s and 1s. – nissim abehcera Oct 05 '19 at 20:35
  • @DavidWohlferd would you want to show an example of that in an answer and I can go ahead and accept it? –  Oct 05 '19 at 20:55
  • @nissimabehcera but why not make a computer that can understand MOVL then? That's the question ;) – m0skit0 Oct 05 '19 at 21:10
  • Note that it is possible to skip the object file step and have the assembler directly produce an executable. However, this makes it difficult to write libraries and large programs as the assembler has to reassemble the entire library every time you use it, consuming a lot of time and memory. A linker is much faster in this regard. – fuz Oct 05 '19 at 21:53
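
To make the linking step described in the comments above a bit more concrete, here is a minimal toy sketch in Python. It is not how `ld` or the ELF format actually work: the dictionary layout (`code`, `defines`, `relocs`), the `link` function and the load address `0x1000` are all assumptions made up for illustration. It only shows the idea of laying out two "object files" in memory and patching the placeholder in a `call` instruction once the address of `print` is known.

```python
import struct

# Hypothetical, highly simplified "object files" (NOT the real ELF layout):
#   code    - raw machine code bytes
#   defines - symbol name -> offset of that symbol inside this file
#   relocs  - offset of a 4-byte placeholder -> symbol whose address goes there
main_o = {
    "code": bytes.fromhex("e8 00 00 00 00 c3"),  # call <print> (placeholder); ret
    "defines": {"main": 0},
    "relocs": {1: "print"},
}
print_o = {
    "code": bytes.fromhex("90 c3"),  # nop; ret
    "defines": {"print": 0},
    "relocs": {},
}

def link(objects, base=0x1000):
    """Toy linker: place objects back to back starting at `base` (an assumed
    load address), then patch every placeholder with a relative displacement."""
    # Pass 1: decide where each object goes and record every symbol's address.
    symbols, starts, addr = {}, [], base
    for obj in objects:
        starts.append(addr)
        for name, offset in obj["defines"].items():
            symbols[name] = addr + offset
        addr += len(obj["code"])

    # Pass 2: fill in the placeholders now that all addresses are known.
    image = bytearray()
    for obj, start in zip(objects, starts):
        code = bytearray(obj["code"])
        for offset, name in obj["relocs"].items():
            # x86 `call rel32` is relative to the address of the next
            # instruction, which begins right after the 4-byte operand.
            displacement = symbols[name] - (start + offset + 4)
            code[offset:offset + 4] = struct.pack("<i", displacement)
        image += code
    return symbols, bytes(image)

symbols, image = link([main_o, print_o])
print({name: hex(address) for name, address in symbols.items()})
print(image.hex(" "))
```

Before linking, the operand of the `call` is just zeros because the assembler could not know where `print` would end up; after linking it holds the distance to `print`. That unresolved placeholder is the sense in which an object file "has not been completely put together".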

1 Answer

What exactly is this doing

That command assembles (translates) your assembly source (human-readable text) into machine code (machine-readable binary) and stores the result in the object file exit.o.

why is this needed

Because CPUs cannot execute text; they can only execute machine code.
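
As a rough illustration of the difference, the Python sketch below prints the bytes of the first source line (which are just ASCII characters, matching the hexdump in the question) next to the machine code for the three instructions in exit.s. The helper names `encode_movl_imm` and `encode_int` are made up for this example, and the encoder only covers the two instruction forms used here; it is not a general assembler.

```python
import struct

# The first line of exit.s as stored on disk: plain ASCII text.
source_line = ".section .text\n"
text_bytes = source_line.encode("ascii")
print(text_bytes.hex(" "))

# Register numbers used by the one-byte "mov imm32 -> r32" opcode (0xB8 + reg).
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3}

def encode_movl_imm(value, reg):
    """Encode `movl $value, %reg`: opcode 0xB8 + register number,
    followed by the value as a 4-byte little-endian immediate."""
    return bytes([0xB8 + REG32[reg]]) + struct.pack("<I", value)

def encode_int(vector):
    """Encode `int $vector`: opcode 0xCD followed by a 1-byte vector number."""
    return bytes([0xCD, vector])

machine_code = (
    encode_movl_imm(1, "eax")    # movl $1, %eax
    + encode_movl_imm(0, "ebx")  # movl $0, %ebx
    + encode_int(0x80)           # int $0x80
)
print(machine_code.hex(" "))
```

The first printed line is the text the assembler reads; the second is the 12 bytes the CPU actually executes for this program.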

Why couldn't the computer understand that directly?

First of all, the first lines (`.section .text` and `.globl _start`) are directives to the assembler and not instructions. These lines tell the assembler how to assemble, not what to assemble.
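
A small Python sketch of that distinction, applied to exit.s: the `classify` function is a deliberately crude heuristic invented for this example (it ignores comments and other syntax the real assembler accepts), but it shows which lines end up as machine code and which only steer the assembler.

```python
# The source of exit.s from the question.
source = """\
.section .text
.globl _start
_start:
    movl $1, %eax
    movl $0, %ebx
    int $0x80
"""

def classify(line):
    """Crude heuristic (an assumption, not the assembler's real grammar):
    'name:' is a label, a leading '.' marks a directive, and anything
    else is treated as an instruction."""
    text = line.strip()
    if not text:
        return "blank"
    if text.endswith(":"):
        return "label"
    if text.startswith("."):
        return "directive"
    return "instruction"

kinds = [classify(line) for line in source.splitlines()]
for kind, line in zip(kinds, source.splitlines()):
    print(f"{kind:<12} {line.strip()}")
```

Only the three lines classified as instructions are encoded as machine code bytes; the directives choose the section and export the `_start` symbol, and the label just names an address.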

Secondly, designing and building a CPU that executes text assembly would definitely be possible, but it has no advantage at all: it would be harder to design, harder to build, and slower at executing programs, and programs would probably be 3 to 4 times the size of current machine code. For example, `movl $1, %eax` takes 13 characters as text but only 5 bytes as machine code. As you can see, it's not very efficient this way.

m0skit0
  • So instead of running `movl $1, %eax` (or the binary representation of that), what would be an example of the code the machine would actually execute? –  Oct 05 '19 at 21:38
  • You can use `objdump --disassemble-all exit.o` to see the binary for each instruction. – m0skit0 Oct 05 '19 at 21:41
  • @TagC198 That instruction encodes to `b8 01 00 00 00`, where `b8` is the operation code (opcode) and `01 00 00 00` is a 4-byte immediate operand. Refer to the Intel manuals for instruction encodings or to [this site](http://ref.x86asm.net/geek32.html) for an overview. – fuz Oct 05 '19 at 21:50
  • By the way: In the 1960s there were computers whose machine code was designed to be human-readable text. For such computers, "machine code" and "assembler" were identical. – Martin Rosenau Oct 06 '19 at 20:26