
Are there multiple ARM assembly syntax styles as there are for x86? If there are, what support is there for each in different tool chains?

Catalin Vasile
  • AFAIK the Intel/AT&T two-way syntax split has no counterpart on ARM. I know of just one ARM syntax/operand order, if that is what you meant. – gilgamash Jun 06 '16 at 11:55
  • Pretty much that was my question. – Catalin Vasile Jun 06 '16 at 11:57
  • But it wasn't only about operand order. For example, AT&T syntax is much more verbose and less prone to human misinterpretation, but it's harder to read between the lines and understand the code overall (or at least it is for me). I'm trying to understand ARM assembly, but it's kinda yucky for me because there are so many details for each instruction at once. – Catalin Vasile Jun 06 '16 at 12:12
  • It wouldn't surprise me if some nutter somewhere has cooked up a version of that ghastly AT&T abomination for the ARM ISA and written an assembler implementing it; it's not like anyone could stop them... whether you'd care or not is another matter. The ISA-level differences that do exist between the legacy ARM/Thumb syntaxes vs. the newer Unified syntax (vs. whatever munging of all of them the GNU _as_ parser accepts :P) are much subtler than you seem to be concerned with here, although it's feasible that some newer assemblers might only support UAL. – Notlikethat Jun 06 '16 at 12:22
  • You realize ARM *itself* has multiple syntaxes? (syntices?) They have the old-style syntax, then Thumb syntax, then UAL, and finally AArch64 has a new syntax... – EOF Jun 06 '16 at 13:27
  • That's the thing. I do not know much about ARM. I've read some architecture history, which included some vague info about the ISA as well, and now I'm trying to get a handle on it. The ARM ISA is awesome in what a single instruction can accomplish, but because each instruction does so much, it's harder for me to grasp and to start reading assembly code fluently. I am trying to see all the available learning paths before concentrating on one of them. – Catalin Vasile Jun 06 '16 at 13:31
  • I suggest you look at various ARM assembly code in open-source projects. No matter what the form, people can generate assembly via Perl, Python, etc., or they can use the C preprocessor. I.e., even if there were 'n' ARM assembler syntaxes, you still wouldn't understand some random code. It would be better to give an example of the code you are trying to understand. The GNU `.syntax unified` mode is probably the most future-proof. However, the differences between any assembler syntaxes I have seen are small compared with the differences in assembly programmers' styles, i.e., function/no function, macros, preprocessed, etc. – artless noise Jun 06 '16 at 16:34
  • I think this is a pretty simple question that could use a definitive answer describing unified vs. old-style arm/thumb. And probably also AArch64 syntax. I'm aware that they exist, but I'm not 100% clear on the differences. e.g. I think old-style thumb syntax requires 2-operand instructions, but unified accepts 3-operand syntax even when assembling Thumb2 machine code, as long as the dst is the same as the first src? I went looking for duplicates, and found http://stackoverflow.com/a/25577464/224132, which is not bad, but not definitive. – Peter Cordes Jun 06 '16 at 20:33
  • Here is an [Ubuntu ARM assembler link](https://wiki.ubuntu.com/ARM/Thumb2PortingHowto#Types_of_Assembly_Language) (which might be outdated and certainly will be at some point). The question as it is now is not really answerable. If it was just about UAL, then there are already other questions. Thumb2 will accept three arguments; it is meant to be compatible with traditional ARM. If your CPU supports it, there is probably no reason not to use Thumb2 unless you want binary compatibility with an older CPU. – artless noise Jun 06 '16 at 21:33
  • @PeterCordes ...and then you get into varying degrees of support for pseudo-ops and/or instruction substitution (e.g. `mov` into `mvn`/`movw`/etc); whether the `#` for immediates is mandatory; whether `@` is a comment marker or an alignment specifier; those Apple builds of GAS/LLVM/whatever it was that only accepted NEON datatypes in a non-standard way (on the instruction rather than the operands, IIRC); the menagerie of different directive/label syntaxes; etc, etc... Sure, simple :P – Notlikethat Jun 06 '16 at 23:40

1 Answer


Assembly language is defined by the assembler, the software that reads it. If more than one assembler exists in the world, then almost by definition more than one assembly language exists. The machine code is obviously defined by the chip/core vendor, and for the assembler to be useful it has to produce that. But the input to the assembler is whatever its authors want. Someone may have created a standard here and there for a particular target, but in general there are no rules. At best the rule is that whatever my assembler accepts is what my compiler should produce, and/or the two are developed in parallel, with the assembly language serving as the communication path between them.
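To make that point concrete, here is a toy sketch in Python, not a real assembler. It encodes a single A32 instruction form, `ADD Rd, Rn, Rm` (always executed, no shift, flags not updated), from two different source syntaxes. The destination-first form follows the ARM documentation; the source-first form with `%` prefixes is a hypothetical dialect invented purely for illustration, and all function names here are made up. Both front ends feed the same encoder, so they emit the same machine word.

```python
import re

# Base encoding for A32 "ADD Rd, Rn, Rm": condition AL, register operand,
# no shift, S bit clear. Rn goes in bits 19:16, Rd in 15:12, Rm in 3:0.
ADD_REG_BASE = 0xE0800000


def encode_add(rd: int, rn: int, rm: int) -> int:
    """Pack three register numbers into the A32 ADD (register) encoding."""
    for reg in (rd, rn, rm):
        if not 0 <= reg <= 15:
            raise ValueError(f"register out of range: r{reg}")
    return ADD_REG_BASE | (rn << 16) | (rd << 12) | rm


def _parse(line: str, reg_pattern: str) -> list:
    """Return the three register numbers of an 'add' line, in source order."""
    mnemonic, _, operands = line.strip().partition(" ")
    if mnemonic.lower() != "add":
        raise ValueError(f"unsupported mnemonic: {mnemonic!r}")
    regs = []
    for operand in operands.split(","):
        match = re.fullmatch(reg_pattern, operand.strip())
        if match is None:
            raise ValueError(f"bad operand: {operand!r}")
        regs.append(int(match.group(1)))
    if len(regs) != 3:
        raise ValueError("expected three operands")
    return regs


def assemble_vendor_style(line: str) -> int:
    """Destination-first syntax, as in the ARM documentation: add r0, r1, r2"""
    rd, rn, rm = _parse(line, r"[rR](\d+)")
    return encode_add(rd, rn, rm)


def assemble_source_first(line: str) -> int:
    """HYPOTHETICAL source-first dialect (an assumption): add %r2, %r1, %r0"""
    rm, rn, rd = _parse(line, r"%[rR](\d+)")
    return encode_add(rd, rn, rm)


print(hex(assemble_vendor_style("add r0, r1, r2")))     # 0xe0810002
print(hex(assemble_source_first("add %r2, %r1, %r0")))  # 0xe0810002
```

The only thing the chip fixes is the 32-bit word; everything to the left of `encode_add` is a choice made by whoever wrote the front end.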

In order to make any sales, the chip/core vendor needs to document its instruction set. As part of that, vendors tend to essentially define an assembly language in the same document. Likewise, for the product to succeed there need to be tools, starting with an assembler. So it is in the vendor's best interest to either produce that tool itself or have someone else produce it, and ideally to have the documentation match.

GNU is well known for hosing up the assembly language defined in the processor vendor's documentation and used by the chip/core vendor's tools, if any. This happened for ARM over 10 years ago, essentially from the beginning of the ARM gas port. And with AArch64 they are even incompatible with themselves. So one would expect at least two different languages: the one used by the chip/core vendor's tools and the one used by the GNU tools. Anyone else making a set of tools either wants to be compatible with something already out there, or doesn't like what is out there and specifically wants to add something or make it different.

Short answer: yes. Expect that every assembler for a target may have its own assembly language, with differences anywhere from subtle to severe.

old_timer