Can I use the Intel syntax with GAS (GNU Assembler) if I assemble on non-Intel systems?

Essentially, I am starting out with assembly programming though I have many years of experience in higher-level languages and I would like to choose an assembler that is suitable for both x86 and ARM. Possibly other architectures as well but this is not a pressing need right now.

So far I have been happy with NASM, but I also need ARM, so GAS seems a nice candidate. Not having written anything very complex yet, I am equally open to AT&T and Intel syntax.

The only thing I am not sure about is whether I can still use GAS for cross-architecture programming if I pick the Intel syntax.

I realise that naturally instruction sets will be different but I would just like to settle on one tool to cover both the architectures - my thinking is that if with time I need a third one there will be less tooling to learn and maintain if I just do everything in GAS from day one.

Intuitively, I would say that there should not be any obstacles to using Intel syntax instead of AT&T to assemble programs on ARM or other non-Intel architectures, because GAS likely builds an AST and then emits architecture-specific code regardless of the frontend's syntax. Unfortunately, I do not have access to an ARM system to try it out and, that notwithstanding, I would very much like to confirm it with more knowledgeable people. Thank you.

EDIT: I came up with an analogy that may perhaps make it clearer what I have in mind. Supposing that someone learns to use vim effectively on Linux to work with Python, all that knowledge will transfer easily to Mac for development with Objective-C.

Needless to say, Python and Objective-C are two distinctly different languages but still, the knowledge of tooling (vim, in this example) will be useful.

If one day the person needs to develop Java on Windows, the same will still hold and the programmer will be able to use vim or a derivative on the third system.

Ultimately, the actual differences between the languages, Python vs. Obj-C vs. Java will naturally dwarf any tooling-related issues but when someone is just starting out, the idea of using the same tool for several different needs is quite appealing.

This is the kind of reusability that I am thinking of.

  • Note that each architecture has its own instruction set and syntax. There is no such thing as “Intel syntax” or “AT&T syntax” on ARM. If you write assembly code, you have to write entirely different code for x86 and ARM, regardless of what assembler you use. – fuz Sep 06 '20 at 14:00
  • There is no notion of Intel or AT&T syntax for Arm assembly. GAS doesn't use standard notation for x86 assembly for backwards compatibility reasons; for Arm there is nothing analogous, and GAS uses the syntax mandated by the Arm manuals – Atticus Stonestrom Sep 06 '20 at 14:00
  • Thanks @fuz - yes, I am aware of the fact that instructions are different, but I am asking if the same tool can be used. Thanks again. –  Sep 06 '20 at 14:01
  • Yes, GAS can be used for ARM assembly. In fact, apart from Keil and the assembler shipped with Go, it's pretty much the only ARM assembler in common use. But do note that the ARM assembly syntax is quite different from both Intel and AT&T syntax. – fuz Sep 06 '20 at 14:02
  • binutils is capable of many architectures, but one at a time: when you compile a set of binary tools, you pick the target. llvm, on the other hand, is or can be the other way, where one set of binaries builds for many architectures, but they don't have a target-specific assembly language; they have one for their bytecode and can now make objects from bytecode. Not sure if I know of another that would be cross-architecture, build time or runtime. Not really much value from an assembler perspective, as there is little crossover from one to another, not enough to be worth it. – old_timer Sep 06 '20 at 15:44
  • Thanks @AtticusStonestrom and fuz - your comments and the answer from James Greenhalgh made me realise that what I was asking about, Intel syntax on ARM, simply does not apply: there is no such thing, even if GAS as such can be used for ARM or other architectures. –  Sep 06 '20 at 16:25
  • @Terry No problem! I think another good analogy alongside the one in your post is to think of Intel/AT&T syntax as being like print or cursive font for writing English letters. They look different, but fundamentally have the same "semantics" (i.e. they mean exactly the same thing). Neither is really relevant when learning to write Chinese characters, e.g. Arm assembly. However, you might use the same tool – say, a pen – for writing all three, and the pen in this analogy would be GAS. – Atticus Stonestrom Sep 06 '20 at 16:38
  • @AtticusStonestrom Makes perfect sense! In a way (and I am half-joking here) if my question were to be taken literally, the real answer would be no, GAS cannot assemble on non-Intel systems if Intel syntax is used :-) It is just that the reason behind it is not any restriction in GAS itself. –  Sep 06 '20 at 16:47

2 Answers

Some of your intuition is right. Gas certainly does support multiple architectures, and there are core features like assembler directives which will enable you to transfer some working knowledge between architecture ports of Gas. Certain command line concepts will be shared; others will differ by what the architecture port maintainers were thinking at the time.

Other aspects of your question, particularly the question of AT&T syntax versus Intel syntax are not well placed for the question you’re asking. I prefer to think of these as dialects of x86 with the challenge being learning the instruction set. What you’re asking with respect to changing architecture is more fundamental; you’re going to learn a new “language” each time, with Gas directives like .balign acting like the only common punctuation marks between those languages.
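
To make the "dialects" point concrete, here is a minimal sketch (not from the original post) of one x86 source file that GAS accepts in both syntaxes. The file name exit_x86.s, the choice of x86-64 Linux, and the exit system call number 60 are assumptions picked purely for illustration:

```asm
# exit_x86.s (hypothetical file name): x86-64 Linux, assembled by GAS
        .text
        .balign 16
        .globl  _start
_start:
        # AT&T syntax (the GAS default on x86): source operand first,
        # registers prefixed with %, immediates prefixed with $
        movl    $60, %eax          # 60 = exit on x86-64 Linux (assumption)
        xorl    %edi, %edi         # exit status 0

        .intel_syntax noprefix     # switch dialect for the lines below
        # Intel syntax: destination operand first, no % or $ prefixes
        mov     eax, 60            # same instruction as the movl above
        xor     edi, edi           # same instruction as the xorl above
        syscall

        .att_syntax prefix         # switch back to the default dialect
```

Assuming a native x86-64 GNU toolchain, `as -o exit_x86.o exit_x86.s` followed by `objdump -d exit_x86.o` should show each AT&T line and its Intel counterpart as the same bytes. The directives (.text, .balign, .globl) are untouched by the dialect switch; only the x86 instruction lines change.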

Gas does require building a separate version for each architecture you want to target. On any individual system, you are probably operating “natively”, as in you want to write assembly for your current machine. That’s not the only way to use Gas, so one way to “trial” it if you don’t have an Arm machine to hand would be to install a cross-assembler (for example, on Ubuntu https://packages.ubuntu.com/focal/binutils-arm-linux-gnueabi ).
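
As a rough sketch of what such a trial could look like, here is a tiny ARM source file. The file name exit_arm.s and the use of the 32-bit ARM EABI Linux exit call are assumptions for illustration, not something taken from the question. The directives are the same ones you would use on x86, while the instructions, the register names and even the comment character differ:

```asm
@ exit_arm.s (hypothetical file name): 32-bit ARM Linux, EABI
@ Needs a GAS built for ARM, e.g. the arm-linux-gnueabi-as cross-assembler
        .text
        .balign 4
        .globl  _start
_start:
        mov     r0, #0             @ exit status 0
        mov     r7, #1             @ 1 = exit in the ARM EABI syscall table (assumption)
        svc     #0                 @ enter the kernel
```

With the Ubuntu package above installed, the tools carry the target prefix, so on an x86 host this should assemble with `arm-linux-gnueabi-as -o exit_arm.o exit_arm.s` and can be inspected with `arm-linux-gnueabi-objdump -d exit_arm.o`. There is no Intel or AT&T switch to flip here; the ARM port accepts only ARM syntax.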

James Greenhalgh
  • Note that `.align` specifically (as well as some other directives) actually differ in semantics between architectures and platforms! Refer to the manual for details. – fuz Sep 06 '20 at 14:21
  • Hah - I picked about as poor an example as possible to showcase common knowledge. Let’s go ahead and upgrade the answer to `.balign` ! – James Greenhalgh Sep 06 '20 at 14:25
  • @JamesGreenhalgh - I made an edit at the end of the question to better express myself, but I believe we both have the same thing in mind. The notion that Intel and AT&T are to be considered merely dialects that apply to x86 only is an intriguing one, and it made me think more thoroughly about the subject. I also suspect that things will become much clearer when I actually get around to working with ARM, in which case I will be sure to leave a comment here, whenever that happens. –  Sep 06 '20 at 16:07
  • @Terry Note that there are indeed assemblers that use the same syntax on all platforms (the Plan 9/Go assembler being the only one I am aware of). It's however a rare choice. Usually, you have different syntax for each platform. – fuz Sep 06 '20 at 17:06
  • The word dialect is in my mind as it is used in the GCC manual to describe the `-masm=dialect` option to the compiler (where dialect can be `intel` or `att`). Something like https://gcc.godbolt.org/z/nzvYWr might help you to visualize the difference between the two "dialects" of x86 and Arm. – James Greenhalgh Sep 06 '20 at 18:23
  • Clang by default compiles in support for several back-ends, so for example `clang -c -target mips foo.s` works on my x86-64 GNU/Linux desktop with MIPS asm source, or `-target aarch64` for AArch64 asm source, etc. (`llvm-objdump -d` to check the results.) More convenient than installing cross-binutils / cross-gcc packages for every ISA you want to be able to randomly play with a little bit. – Peter Cordes Sep 06 '20 at 18:44

"I would like to choose an assembler that is suitable for both x86 and ARM"

What exactly do you want to do?

  1. Writing programs on an ARM computer (in x86 syntax!) that shall later run on an x86 PC or writing programs on an x86 PC (in ARM syntax!) that shall later run on an ARM CPU?
  2. Writing assembly programs that shall run both on an ARM and on an x86 PC?

If the answer is 1.:

Many CPUs in smaller devices (for example WLAN routers or smartphones) are ARM CPUs. However, you want to develop programs for such devices on your PC, which has an x86 CPU.

What you do is use a GAS version with the "target" ARM and the "host" x86. This means that GAS is running on an x86 CPU but generates code for an ARM CPU.

However, your "source code" (assembly program) must be an ARM assembly program.

As far as I know, GAS supports only one syntax variant for ARM CPUs; there is nothing like the "AT&T" syntax for ARM.

If you have an ARM computer and you want to write x86 programs on it, you can of course use a GAS version with the "target" x86 and the "host" ARM. If the "target" is x86, GAS supports "AT&T" and "Intel" syntax independently of the "host".

If the answer is 2.:

This won't work!

In assembly language, one assembly instruction typically represents one instruction of the CPU. And different CPU architectures have completely different instructions and therefore completely different assembly code.

Here is an example program for x86 and for ARM:

Intel CPU, Intel syntax         ARM (non-Thumb) CPU
-----------------------         -------------------
mov eax, 4
mov ebx, 1                      ldr r0, =1
mov ecx, offset myText          ldr r1, =myText
mov edx, 6                      ldr r2, =6
int 0x80                        svc #0x900004

shr edi, 6
add eax, edi                    add r0, r0, r6, lsr #6

                                ldr r5, =someVariable
add [someVariable], eax         ldr r7, [r5]
                                add r7, r0
                                str r7, [r5]

You can see that there is no 1:1 relation for the instructions:

The instruction add [someVariable], eax on the x86 requires 4 instructions on the ARM; the instruction add r0, r0, r6, lsr #6 on the ARM requires 2 instructions on the x86.

Martin Rosenau
  • Thanks for the answer but yes, I was only thinking about the first case. I was somehow under a mistaken belief that Intel syntax applies to Intel-based CPUs only whereas AT&T could be used on Intel and elsewhere too. But I now understand my own confusion. –  Sep 07 '20 at 09:19