3

Just wondering how the world of assembly works, and I was reading about the assembly language on wiki and this quote struck me:

It implements a symbolic representation of the numeric machine codes and other constants needed to program a particular CPU architecture.

I always thought assembly was a fixed language based on your CPU (with different compilers and languages based on said CPU) so that for your CPU you could only use this type of assembly to talk to your hardware.

But based on that quote, there could be other languages that use other symbols to represent the same numeric machine code.

So, are there any other languages that talk straight to the hardware that aren't assembly? Or am I getting it wrong?

Peter Cordes
Ólafur Waage

10 Answers

8

You could use a different set of symbols to represent the machine codes. But nobody bothers, because you wouldn't gain much.

ARM has an instruction called ADD. In ARM assembler, "ADD r0, r0, #1" represents the 4 bytes of machine code which constitute an instruction to increment register 0.

Whatever you call that instruction, you can't change the set of instructions available and still call it ARM assembler. It's still fundamentally the same programming language whether you call the ADD operation "ADD", or "SUM", or "PLUS", or "ADDITION". Since it's easier to use existing references if everyone uses the same names for everything, that's what happens.
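To make the point concrete, here is a minimal sketch of a toy encoder for that one instruction form, assuming the classic 32-bit ARM (A32) data-processing encoding with an unrotated 8-bit immediate. The function name and the alias table are hypothetical, invented for illustration; no real assembler is being quoted.

```python
# Toy encoder for a single ARM (A32) instruction form: ADD Rd, Rn, #imm8.
# Assumed field layout: cond=1110 (always), I=1, opcode=0100 (ADD), S=0.
ADD_IMMEDIATE_BASE = 0xE2800000

# Hypothetical aliases: any spelling could name the same operation.
MNEMONIC_ALIASES = {"ADD", "SUM", "PLUS", "ADDITION"}


def encode_add_immediate(mnemonic, rd, rn, imm8):
    """Return the 32-bit machine word for '<mnemonic> rd, rn, #imm8'."""
    if mnemonic.upper() not in MNEMONIC_ALIASES:
        raise ValueError(f"unknown mnemonic: {mnemonic}")
    if not (0 <= rd <= 15 and 0 <= rn <= 15):
        raise ValueError("register numbers must be in 0..15")
    if not 0 <= imm8 <= 255:
        raise ValueError("this sketch only handles unrotated 8-bit immediates")
    # Rn sits in bits 19-16, Rd in bits 15-12, the immediate in bits 7-0.
    return ADD_IMMEDIATE_BASE | (rn << 16) | (rd << 12) | imm8


word = encode_add_immediate("ADD", rd=0, rn=0, imm8=1)
print(f"{word:08X}")  # prints E2800001
```

Calling it with "SUM" or "PLUS" instead of "ADD" yields the same word, which is the sense in which the language stays the same whatever the symbols are.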

One useful change might be to represent the instruction as "INC r0", since ARM doesn't have an INC instruction, and it's a common operation. This leads to macros in assembler languages. These genuinely do change the language, but once you have macros which emit multiple ARM instructions, you start to lose the WYSIWYG nature of assembly. Eventually you start to think that maybe you might as well just write C. I speak from experience (it wasn't ARM, but it was a macroised assembler).
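A rough sketch of what such a macro layer does, written as a text pre-pass in Python. Both macros here are hypothetical: "INC" is the one suggested above, and "SWAP" is an invented example of a macro that emits several instructions (the XOR-swap idiom using ARM's EOR), which is where the one-line-per-instruction property starts to slip.

```python
def expand_macros(lines):
    """Expand hypothetical INC/SWAP macros into plain ARM-style instructions."""
    expanded = []
    for line in lines:
        parts = line.replace(",", " ").split()
        if not parts:
            continue  # skip blank lines
        op, args = parts[0].upper(), parts[1:]
        if op == "INC" and len(args) == 1:
            # One macro line becomes one real instruction.
            expanded.append(f"ADD {args[0]}, {args[0]}, #1")
        elif op == "SWAP" and len(args) == 2:
            # One macro line becomes three real instructions (XOR swap).
            a, b = args
            expanded.append(f"EOR {a}, {a}, {b}")
            expanded.append(f"EOR {b}, {b}, {a}")
            expanded.append(f"EOR {a}, {a}, {b}")
        else:
            # Anything else is passed through untouched.
            expanded.append(line.strip())
    return expanded


for instruction in expand_macros(["INC r0", "SWAP r1, r2", "MOV r3, #0"]):
    print(instruction)
```

Three source lines go in and five instructions come out, so the listing you wrote no longer matches the machine code line for line.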

One common difference is case - if you felt like being pedantic, you could argue that there are two different versions of ARM assembler language, one in uppercase and one in lowercase (or argue that there's one language, with multiple symbols for the same thing). Different disassemblers of the same machine code sometimes output different formats. Sometimes these are different enough that a particular assembler won't cope with all of them, or assemblers will offer their own conveniences which are incompatible with another assembler on the same platform. But really, it's all the same thing, and if you're bothering to draw the distinction, it's generally because you've been bitten in the ass rather than because anything good is happening...

Steve Jessop
  • 1
    You do get assemblers with slightly different syntax for the same instruction set -- IIRC, Microsoft & GNU x86 assemblers use different argument order, indirect-addressing syntax, etc. But it's still usually the same mnemonic word for each instruction (ADD, INC, etc)... – Jeff Shannon Apr 03 '09 at 04:46
5

You are getting it wrong (or possibly right - it's difficult to tell from your question). Assembly language is a symbolic (easy for humans to read) representation of the binary patterns of instructions for a particular CPU architecture. One does occasionally come across references to "portable assembler" (Scott Nudds, anyone?) but these are really slightly higher-level languages.

  • What I'm thinking is, aren't there other languages that call the same instructions but with a different syntax (on the same CPU) – Ólafur Waage Mar 24 '09 at 11:29
  • 1
    Yes - most CPU architectures actually have several different assemblers which use subtly different symbolic representations of the same machine instructions –  Mar 24 '09 at 11:33
  • Exactly what I was thinking about. Thanks. – Ólafur Waage Mar 24 '09 at 11:36
  • Another thing that "portable assembler" might sometimes mean, is an assembler language for a bytecode which is not a hardware machine code. For instance, jasm is a Java (dis)assembler, which addresses the JVM instruction set directly rather than through the Java Programming Language. – Steve Jessop Mar 24 '09 at 11:57
  • ... So it's a higher-level language in the sense that it doesn't directly address the hardware (unless your hardware has Jazelle, in which case it does), and in the case of Java it also offers some very sophisticated ops. But it needn't - an ARM emulator allows "portable ARM assembly" code... – Steve Jessop Mar 24 '09 at 12:00
  • The name of Scott Nudds shall not be invoked in any online forums. – Jimmy J Mar 24 '09 at 12:54
  • Portable Assembler? Isn't that called "C"? I mean arguably C is a preprocessor for an assembler. Or can be. And it's basically PDP assembler anyway. – Peter Wone Mar 24 '09 at 12:58
  • C is "portable assembler" in the same sense that a modern laser printer is a "typesetting-free printing press" - only in loose metaphor. Yes, you get (almost) direct access to memory and can arbitrarily decide what that memory means, but the grammar allows for abstractions that cannot exist in ASM. – Jeff Shannon Apr 03 '09 at 04:58
3

Here is an example from Clozure Common Lisp. It lets you write inline assembly code in Lisp. The following defines a function %safe-get-ptr written in its x86 assembler notation:

(defx86lapfunction %safe-get-ptr ((src arg_y) (dest arg_z))
  (check-nargs 2)
  (save-simple-frame)
  (macptr-ptr src imm0)
  (leaq (@ (:^ done) (% fn)) (% ra0))
  (movq (% imm0) (@ (% :rcontext) x8664::tcr.safe-ref-address))
  (movq (@ (% imm0)) (% imm0))
  (jmp done)
  (:tra done)
  (recover-fn-from-rip)
  (movq ($ 0) (@ (% :rcontext) x8664::tcr.safe-ref-address))
  (movq (% imm0) (@ x8664::macptr.address (% dest)))
  (restore-simple-frame)
  (single-value-return))

It is still assembly. Besides that there are lots of languages which have low-level constructs to set/read values from memory or registers, etc.

The CPU does not execute assembly language. Assembly language is only some (more or less direct) textual representation of the specific CPU machine code.

Rainer Joswig
3

Sure, there are lots of languages that talk directly to the hardware that are not assembly. For example, on the Burroughs B5000, the CPU was programmed in a variant of ALGOL; on the Lisp Machine, the CPU executed Lisp code directly; on the early Smalltalk workstations, the CPU executed Smalltalk bytecode directly. Researchers have built CPUs based on graph-reduction engines that execute lambda calculus directly. More than one company has built Java processors, which are of course programmed in JVM bytecode.

Jörg W Mittag
2

Clarifying some answers regarding the Burroughs B5000 and B6000 series machines, there was no Assembler program, and thus no Assembly-language programming. There was also a complete absence of a linking loader. The single-pass Algol compiler (written by Donald Knuth) generated machine code directly. The hardware reference manual describes the instructions using mnemonics that assembly programmers would recognize, but that's the closest we get to it.

You could ask the Algol compiler to print the generated code inline with the source code during compilation.

The Narrative Description of the Master Control Program provides a good description of the stack architecture and the main instructions.

2

Assembly languages are very closely related to the hardware architecture of the target system.

To a large extent there is a one-to-one mapping from asm code to machine instruction -- that's the whole point really -- so you can manipulate the hardware at the level of individual instructions.
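As a rough illustration of that one-to-one mapping, here is a minimal sketch that turns a single machine word back into its assembly text. It assumes 32-bit ARM (A32) as the example architecture and only recognises one instruction form (unconditional ADD with an unrotated 8-bit immediate); the function name is hypothetical.

```python
def disassemble_add_immediate(word):
    """Decode one A32 'ADD Rd, Rn, #imm8' word back into assembly text."""
    # Bits 31-28 must be 1110 (always); bits 27-20 must be 0010 1000
    # (immediate operand, opcode ADD, flags not updated).
    if (word >> 28) != 0xE or (word & 0x0FF00000) != 0x02800000:
        raise ValueError("not an unconditional ADD-immediate word")
    if (word >> 8) & 0xF:
        raise ValueError("rotated immediates are outside this sketch")
    rn = (word >> 16) & 0xF  # first source register
    rd = (word >> 12) & 0xF  # destination register
    imm8 = word & 0xFF       # immediate value
    return f"ADD r{rd}, r{rn}, #{imm8}"


print(disassemble_add_immediate(0xE2800001))  # prints ADD r0, r0, #1
```

Each field of the word corresponds to exactly one piece of the text, which is why assembling and disassembling can round-trip without losing anything.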

They also allow you to access and manipulate memory in a manner that matches the machine's memory architecture (monolithic, segmented, virtual, etc.).

Assemblers vary greatly: some do little more than translate three-letter codes to 4-byte instructions; others, like the venerable OS/390 assembly language, are sophisticated programming environments in their own right.

Having said all this, most modern chips are emulating ancient instruction sets, so you are really not that close to the wire anyway. And the better C compilers are aware of the underlying microarchitectures (things like pipelines, how many integer instructions are executed every cycle, etc.), so a good C compiler will nearly always outperform mediocre assembly code!

James Anderson
1

Yes, it's called FORTH, as long as you view the hardware as virtual! The machine code for the primitive register operations of the FORTH stack machine is FORTH. But if you emulate this hardware, perhaps it counts? Have a look at http://www.greenarraychips.com/ for the leading edge, and at the classic from 1984, "Thinking Forth" by Leo Brodie, which may help you ... even if you never use Forth.
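For a feel of what "virtual hardware" means here, this is a minimal sketch of a Forth-like stack machine emulated in Python. It is not a real Forth: it supports only integers, three arithmetic words and three stack words, and the function name is made up for illustration.

```python
def run_forth(source):
    """Execute a tiny Forth-like program and return the final data stack."""
    arithmetic = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    stack = []
    for token in source.split():
        word = token.lower()
        if word in arithmetic:
            b = stack.pop()  # top of stack is the right-hand operand
            a = stack.pop()
            stack.append(arithmetic[word](a, b))
        elif word == "dup":
            stack.append(stack[-1])
        elif word == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif word == "drop":
            stack.pop()
        else:
            stack.append(int(token))  # anything else is a number literal
    return stack


print(run_forth("2 3 + 4 *"))  # prints [20]
```

Each word acts directly on the machine's only state, the stack, so the source text and the emulated machine's primitive operations are the same thing.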

Andrew
  • 2
    "as long as you view the hardware as virtual" - is that truly different from modern bytecode languages like .NET bytecode (CIL) or Java .class bytecode? Those are also both stack VMs, just with different purposes than Forth. I don't know Forth well, but my understanding is that it's still an interpreted or compiled language with a lightweight runtime. If you relax the definition of "talk to the hardware" as "access arbitrary I/O or memory addresses" instead of "executed directly by the CPU", then C qualifies. – Peter Cordes Aug 25 '20 at 03:43
1

Assembly intermixed with C is used a lot. Some CPUs (like the 8052 chip) come with a higher-level language burned into ROM. These languages have special statements that allow interaction with hardware at a low level.

A family of CPUs is generally designed to use the same machine codes, which means the same assembly language. A specific CPU may have more cache, pipelines, etc., but otherwise can run the same machine code as the other CPUs in the same family.

So software compiled for one CPU will run on all of them. One of the most popular is the i386 instruction set, which is found powering nearly all Windows machines. There is a 16-bit predecessor and a 64-bit successor.

RS Conley
1

... So that for your CPU you could only use this type of assembly to talk to your hardware.

All languages eventually convert to instructions that are executed on real hardware, whether that is done fairly directly as with an assembler or through a high level of abstraction as with C. The tricky bit is actually getting the machine instructions to manipulate the hardware in ways that you want since one point of higher level languages is to shield you from the hardware details.

Some languages, like C, are designed with the intent to manipulate hardware directly and so they include keywords like volatile to prevent the compiler from otherwise optimizing away references to device registers. These may be written and not read back so that the compiler thinks the value saved is never used again. Or it may be necessary to read a device register though the value is never used. There are also miscellaneous instructions for such operations as enabling and disabling interrupts that an ordinary program will not generate.

This may also require linker support so that memory locations (for memory mapped I/O) can be located at the correct addresses for device registers. However some processors use distinct instructions for I/O and there must be some facility for inserting them in the code stream, so in many cases it may not be possible to access H/W unless there is explicit language support.

And finally, with most modern operating systems like Windows and Linux, applications run in virtual memory where program addresses do not match physical addresses, and programs are usually denied access to the hardware. Code that tries to access hardware when the OS has not granted it specific permissions will generate an interrupt, return control to the OS, and no longer execute.

HankB
1

Your question was:

So, are there any other languages that talk straight to the hardware that aren't assembly? Or am I getting it wrong?

I'm surprised no one's mentioned Register Transfer Language, or any of the hardware description languages, such as Verilog or VHDL.

RTL isn't a programming language per se, and is generally hardware-neutral (assembly is definitely NOT neutral, it's targeted to a specific architecture).

VHDL and Verilog are most commonly used for programmable logic, which I think qualifies as "talking straight to the hardware". Soft cores are often implemented in programmable logic, so you could use one of these to implement (for example) an ARM processor, which itself could be programmed in assembly....

Fun stuff.... makes me wish I could go back & do all my EE/CE work again....

Dan
  • 1
    If it is a hardware-neutral language, then it can't talk directly to the machine. Though I appreciate that you mention RTL here. – Sebastian Mach Mar 24 '09 at 17:38
  • 1
    I wouldn't say Verilog/VHDL "talk straight to the hardware" - they describe logic operations, and go through a multiple-step compilation process (including chip-specific libraries and some hefty nondeterministic simulation) to produce a binary file that gets loaded into the programmable logic chip. – Jeff Shannon Apr 03 '09 at 04:41