11

Both tools translate assembly instructions directly into machine code, but is it possible to determine which one produces the fastest and cleanest code?

starblue

6 Answers

29

When you write in assembly, you describe precisely which instructions to generate, so the result doesn't depend on the assembler. It depends on you. There's a one-to-one correspondence between the mnemonics you write and the actual instructions in machine code.

Mehrdad Afshari
  • 14
    Technically that's not quite true; for things like addressing modes, there can be more than one way of encoding the same location. There are other isomorphisms that I've forgotten about; an x86 shareware assembler from back in the day used these isomorphisms to "fingerprint" code assembled with it, with a view to catching cheats. – Barry Kelly Aug 15 '09 at 19:35
  • 1
    I understand what you are saying. There can be different machine code instructions with a similar effect. Technically, an assembler can provide two different representations for those instructions (`int3` and `int 3`) or use some hints to choose which one to generate. This is usually documented for each assembler. I guess for the purpose of this question it's reasonable to ignore these issues since, theoretically, **they are really different instructions** and the assembler could let you choose between them. – Mehrdad Afshari Aug 15 '09 at 19:39
  • 5
    Even in assembler there is some room for optimizing. A lot of architectures allow short addressing modes, for example using a byte value for a relative displacement instead of a long. Since these optimizations change the size of the object code, which in turn lets more displacements fit the short form, a multi-pass optimization is required. AFAIK yasm can do this. – Gunther Piez Aug 16 '09 at 00:02
  • The point is, while assemblers can help, most of the time, you can still force the exact instruction you want to generate. – Mehrdad Afshari Aug 16 '09 at 02:59
  • 2
    If I remember correctly, NASM in particular is of the "what you write is what you get" persuasion, and will generate precisely the opcodes you request; if there's an ambiguity anywhere, you will have to disambiguate explicitly. – Pavel Minaev Aug 17 '09 at 22:46
  • It may also matter if you may need to write assembly on other platforms. It's nice to have them be in the same format, and NASM only works for x86/x64. – Christopher Aug 19 '09 at 20:00
  • 3
    @Pinael: NASM lets you specify an "optimisation level" which actually indicates how many passes over the code to make. As more addresses are resolved, long addresses can be replaced with short ones and so forth. – Artelius Nov 08 '09 at 05:38
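
To make the multi-pass idea from the comments above concrete, here is a minimal Python sketch of displacement shrinking. It is a toy model, not how NASM or yasm actually implement it: the `relax` function and the `program` tuple format are made up for illustration, and the sizes assume 32-bit mode, where `jmp rel8` takes 2 bytes and `jmp rel32` takes 5.

```python
SHORT, NEAR = 2, 5  # sizes in bytes of `jmp rel8` and 32-bit `jmp rel32`

def relax(program):
    """Shrink near jumps to short ones until a fixed point is reached.

    program is a list of ("label", name), ("code", nbytes) or ("jmp", name)
    (a hypothetical format for this sketch).
    Returns ({jump index: size in bytes}, number of passes).
    """
    # Start pessimistic: every jump uses the long form.
    size = {i: NEAR for i, (kind, _) in enumerate(program) if kind == "jmp"}
    passes = 0
    while True:
        passes += 1
        # Lay out the program using the current jump sizes.
        addr, start, label = 0, {}, {}
        for i, (kind, arg) in enumerate(program):
            start[i] = addr
            if kind == "label":
                label[arg] = addr
            elif kind == "code":
                addr += arg
            else:
                addr += size[i]
        # Shrink every near jump whose displacement fits in a signed byte.
        changed = False
        for i in size:
            if size[i] == SHORT:
                continue
            target = label[program[i][1]]
            if target > start[i]:
                # Forward jump: shrinking moves the target and the end of
                # the jump by the same amount, so the distance is unchanged.
                disp = target - (start[i] + NEAR)
            else:
                # Backward jump: the displacement is measured from the end
                # of the (shrunk) jump instruction.
                disp = target - (start[i] + SHORT)
            if -128 <= disp <= 127:
                size[i] = SHORT
                changed = True
        if not changed:
            return size, passes

program = [("jmp", "end"), ("code", 120), ("jmp", "end"), ("code", 4), ("label", "end")]
sizes, passes = relax(program)
```

In the sample program the second jump shrinks on the first pass, which pulls `end` close enough for the first jump to shrink on the second pass; a third pass confirms that nothing else changes.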
8

I don't know about these two specific tools, but there are some instructions that can be encoded differently:

  • ADD AX,1 is either 05 01 00, 81 c0 01 00, or 83 c0 01 (with 16-bit operand size)
  • INT 3 is either cc or cd 03
  • New AVX instructions that extend two-byte SSE instructions will either have a 2-byte or 3-byte prefix. All 2-byte prefixes can be encoded as 3-byte prefixes as well.

These are just a few examples off the top of my head of how assemblers can encode the same instruction differently, so the question does in fact make sense.
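
A small Python sketch of the same point, assuming 16-bit operand size and the opcode forms `05 iw`, `81 /0 iw` and `83 /0 ib` from the Intel manual for ADD. The `decode16` function is a made-up toy decoder that only knows these few encodings; it is not a real disassembler.

```python
def decode16(code: bytes) -> str:
    """Decode a handful of 16-bit-mode x86 encodings (toy decoder, AX only)."""
    op = code[0]
    if op == 0xCC and len(code) == 1:
        return "int3"                           # one-byte breakpoint form
    if op == 0xCD and len(code) == 2:
        return f"int {code[1]}"                 # generic INT imm8
    if op == 0x05 and len(code) == 3:           # ADD AX, imm16
        return f"add ax, {int.from_bytes(code[1:3], 'little')}"
    if op == 0x81 and len(code) == 4 and code[1] == 0xC0:
        # ADD r/m16, imm16 with ModRM byte C0 selecting AX
        return f"add ax, {int.from_bytes(code[2:4], 'little')}"
    if op == 0x83 and len(code) == 3 and code[1] == 0xC0:
        # ADD r/m16, imm8 with the immediate sign-extended to 16 bits
        imm = int.from_bytes(code[2:3], 'little', signed=True) & 0xFFFF
        return f"add ax, {imm}"
    raise ValueError("encoding not covered by this sketch")

add_forms = {decode16(bytes.fromhex(h)) for h in ("05 01 00", "81 c0 01 00", "83 c0 01")}
int_forms = [decode16(bytes.fromhex(h)) for h in ("cc", "cd 03")]
```

All three ADD byte sequences decode to the same `add ax, 1`, so an assembler is free to pick any of them unless you tell it otherwise.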

Nathan Fellman
  • Most assemblers choose the shortest encoding, and they can provide hints that let you pick the specific one you'd like to generate. For instance, `cc` and `cd 03` are two distinct instructions from the CPU's perspective. It's entirely up to the assembler whether it provides a way to generate a particular one. – Mehrdad Afshari Aug 15 '09 at 20:21
  • I agree that a good assembler *should* do that. I'm just pointing out ways in which an assembler can do something different. For instance, `INC AX` encoded as `fe c0` is exactly the same as `ADD AX, 1` encoded as `05 01` as far as functionality and instruction length are concerned, though they are indeed decoded separately in the CPU. – Nathan Fellman Aug 16 '09 at 04:02
  • That's what I was actually looking for with my question. I'm starting to learn x86 assembly (been reading the Intel manual) and want to use Linux tools for that, but was in doubt about which tool to start with. Since it seems no real difference exists between gas and nasm, I'll pick nasm because of the friendlier syntax. Thanks for the help! –  Aug 16 '09 at 19:39
  • 1
    @NathanFellman actually `ADD AX,1` (i.e. `05 01`) and `INC AX` (i.e. `FE C0`) are **very** different instructions: the former affects all arithmetic-related flags, while the latter doesn't affect `CF`. An assembler which exchanges one for another is a bad assembler. I'd be surprised if I wrote `add ax,1; jc carried` and the jump actually appeared to depend on previous state of `CF` instead of the result of addition. – Ruslan Apr 30 '16 at 10:44
  • @Ruslan: you're absolutely right. I wouldn't go so far as to say they're *very* different, but yes, there is this difference that I overlooked. – Nathan Fellman Oct 16 '17 at 06:55
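
The flag difference discussed in the comments above can be sketched in a few lines of Python. This is only a model of the carry flag for 16-bit operands (the other arithmetic flags are left out), and the helper names `add16` and `inc16` are invented for the example.

```python
def add16(ax, imm, cf):
    """Model ADD r16, imm16: CF is overwritten with the carry out of bit 15."""
    total = ax + imm
    return total & 0xFFFF, int(total > 0xFFFF)

def inc16(ax, cf):
    """Model INC r16: same result as adding 1, but CF keeps its old value."""
    return (ax + 1) & 0xFFFF, cf

# Both wrap 0xFFFF around to 0, but only ADD reports the carry.
after_add = add16(0xFFFF, 1, 0)
after_inc = inc16(0xFFFF, 0)
```

So a `jc` placed after `add ax, 1` tests the addition itself, while after `inc ax` it would test whatever set CF earlier.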
5

As a side note on the syntax matter: you can have GAS work perfectly fine with Intel syntax by putting the following line at the top of your source file:

.intel_syntax noprefix

I use Intel syntax for all my assembly needs too. It seems far more natural than the AT&T syntax. And it saves some keystrokes :-).

RoaldFre
4

It is an assembler... it does not optimize code. It just translates it as is. So the fastest and cleanest code is produced by the programmer or the compiler.

Artyom
  • 1
    Ok got it! Now, the assembler is also a compiler right? Could it save a CPU cycle here and there? that was my doubt... –  Aug 16 '09 at 19:52
  • 1
    An assembler is not a compiler. A compiler typically relies on an assembler, and the assembler translates assembly into binary opcodes. – tsturzl Sep 24 '15 at 20:26
2

Obviously nasm because Intel syntax looks much cleaner than AT&T syntax.

Brian
  • Yes, the clean "look" counts as a plus too, I think. –  Aug 16 '09 at 19:41
  • 4
    I honestly prefer AT&T syntax. I think it conveys the intent of the code more clearly and looks less cluttered. It's just too bad there's so little documentation on it out there. – Kasper Aug 18 '09 at 03:02
  • 4
    +1 For recommending Intel syntax over AT&T. Friends don't let friends use AT&T. – Jason Dec 03 '09 at 05:29
  • @Jason Haha! I'm thinking that might just be my new motto. :) – Elliott Dec 10 '12 at 00:36
  • GAS also supports Intel syntax starting from binutils-2.10, which was released in 2000 — not too recently. – Ruslan Apr 30 '16 at 10:54
1

@Brian: that was not the question ...

@cyber98834: Well, an assembler does what every assembler must do: translate every instruction to its opcode.

There's no optimization.

Oh, and there's no such thing as the "fastest code" ... Can I ask you a question? The CPU's speed is fixed, isn't it?

So you can't make code run faster, because you can't change the CPU's speed.

But you can shrink the code so that the CPU handles fewer instructions and so takes less time to run.

I hope you understand what I'm trying to say.

I suggest you buy (or look for a PDF, though I don't know if that's legal) Michael Abrash's Graphics Programming Black Book, which covers many optimization lessons.

SBouazza
  • Hi SBouazza! I understand your point. Thanks! I've already got Michael Abrash's Graphics Programming Black Book ready (http://www.gamedev.net/reference/articles/article1698.asp) for later :) –  Aug 16 '09 at 19:44