Questions tagged [assembly]

Assembly language questions. Please tag the processor and/or the instruction set you are using, as well as the assembler; a valid set of tags looks like this: ([assembly] [x86] [gnu-assembler] or [att]). Use the [.net-assembly] tag instead for .NET assemblies, [cil] for .NET assembly language, and [java-bytecode-asm] for Java bytecode.

Assembly is a family of very low-level programming languages, just above machine code. In assembly, each statement typically corresponds to a single machine code instruction. These instructions are represented as mnemonics in the given assembly language and are converted into executable machine code by a utility program referred to as an assembler; the conversion process is referred to as assembly, or assembling the code.
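The statement-to-instruction mapping can be sketched with a toy assembler. This is a minimal illustration, not a real tool: the `assemble` function and the `OPCODES` table are hypothetical names, and only a handful of single-byte 8086 instructions that take no operands are covered.

```python
# Toy assembler: every source statement becomes exactly one machine instruction.
# OPCODES lists a few real single-byte 8086 opcodes; the table and the
# function name are illustrative and not taken from any real assembler.
OPCODES = {
    "nop": 0x90,
    "ret": 0xC3,  # near return
    "hlt": 0xF4,
    "cli": 0xFA,
    "sti": 0xFB,
}


def assemble(source: str) -> bytes:
    """Translate one mnemonic per line into one opcode byte per statement."""
    out = bytearray()
    for line in source.splitlines():
        # Drop ';' comments and surrounding whitespace.
        statement = line.split(";", 1)[0].strip().lower()
        if not statement:
            continue
        if statement not in OPCODES:
            raise ValueError(f"unknown mnemonic: {statement!r}")
        out.append(OPCODES[statement])
    return bytes(out)


program = """
    cli     ; disable interrupts
    nop
    sti     ; enable interrupts
    ret
"""
machine_code = assemble(program)
print(machine_code.hex(" "))  # fa 90 fb c3
```

A real assembler also handles operands, labels, and directives, but the core job is the same table-driven translation.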

Language design

Basic elements

There is a large degree of diversity in the way that assemblers categorize statements and in the nomenclature that they use. In particular, some describe anything other than a machine mnemonic or extended mnemonic as a pseudo-operation (pseudo-op). A typical assembly language consists of three types of instruction statements that are used to define program operations:

  • Opcode mnemonics
  • Data sections
  • Assembly directives

Opcode mnemonics and extended mnemonics

Instructions (statements) in assembly language are generally very simple, unlike those in high-level languages. Generally, a mnemonic is a symbolic name for a single executable machine language instruction (an opcode), and there is at least one opcode mnemonic defined for each machine language instruction. Each instruction typically consists of an operation or opcode plus zero or more operands. Most instructions refer to a single value, or a pair of values. Operands can be immediate (value coded in the instruction itself), registers specified in the instruction or implied, or the addresses of data located elsewhere in storage. This is determined by the underlying processor architecture: the assembler merely reflects how this architecture works. Extended mnemonics are often used to specify a combination of an opcode with a specific operand. For example, the System/360 assemblers use B as an extended mnemonic for BC with a mask of 15 and NOP for BC with a mask of 0.
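The System/360 example can be sketched as a simple rewrite step. This is a hedged illustration: `EXTENDED` and `expand` are hypothetical names, the output is assembler text rather than machine code, and `BR` (BCR with a mask of 15) is added alongside the mnemonics named in the text.

```python
# Sketch of extended-mnemonic expansion for System/360 style assemblers.
# Each extended mnemonic stands for a base opcode plus a fixed condition mask.
# The table and function names are illustrative assumptions.
EXTENDED = {
    "B": ("BC", 15),     # branch always
    "NOP": ("BC", 0),    # never branches
    "BR": ("BCR", 15),   # branch always, register form
    "NOPR": ("BCR", 0),  # never branches, register form
}


def expand(statement: str) -> str:
    """Rewrite an extended mnemonic as its base opcode with an explicit mask."""
    parts = statement.split(None, 1)
    if not parts:
        return ""
    mnemonic = parts[0].upper()
    operands = parts[1].strip() if len(parts) > 1 else ""
    if mnemonic not in EXTENDED:
        return statement.strip()  # ordinary mnemonic: leave unchanged
    base, mask = EXTENDED[mnemonic]
    return f"{base} {mask},{operands}"


print(expand("B LOOP"))       # BC 15,LOOP
print(expand("NOPR 0"))       # BCR 0,0
print(expand("BC 8,EQUAL"))   # BC 8,EQUAL
```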

Extended mnemonics are often used to support specialized uses of instructions, often for purposes not obvious from the instruction name. For example, many CPUs do not have an explicit NOP instruction, but do have instructions that can be used for the purpose. On 8086 CPUs the instruction xchg ax,ax is used for nop, with nop being a pseudo-opcode that encodes the instruction xchg ax,ax. Some disassemblers recognize this and will decode the xchg ax,ax instruction as nop. Similarly, IBM assemblers for System/360 and System/370 use the extended mnemonics NOP and NOPR for BC and BCR with zero masks. For the SPARC architecture, these are known as synthetic instructions.
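The 8086 case can be made concrete with the short form of xchg, which is a single byte: 0x90 plus the number of the register exchanged with ax. The sketch below assumes 16-bit operand size; `encode_xchg_ax` and `disassemble` are hypothetical helper names.

```python
# The one-byte 8086 form "xchg ax, r16" is encoded as 0x90 + register number,
# so exchanging ax with itself yields 0x90, the byte that also encodes nop.
# Helper names are illustrative; 16-bit operand size is assumed.
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}
REG16_NAMES = {number: name for name, number in REG16.items()}


def encode_xchg_ax(reg: str) -> int:
    """Return the opcode byte for 'xchg ax, reg'."""
    return 0x90 + REG16[reg.lower()]


def disassemble(opcode: int) -> str:
    """Decode one byte in the 0x90..0x97 range, preferring 'nop' for 0x90."""
    if opcode == 0x90:
        return "nop"
    if 0x91 <= opcode <= 0x97:
        return f"xchg ax,{REG16_NAMES[opcode - 0x90]}"
    raise ValueError(f"not an xchg ax,r16 opcode: {opcode:#04x}")


print(hex(encode_xchg_ax("ax")))            # 0x90
print(disassemble(encode_xchg_ax("ax")))    # nop
print(disassemble(encode_xchg_ax("bx")))    # xchg ax,bx
```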

Some assemblers also support simple built-in macro-instructions that generate two or more machine instructions. For instance, with some Z80 assemblers the instruction ld hl,bc is recognized to generate ld l,c followed by ld h,b. These are sometimes known as pseudo-opcodes.
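A built-in macro-instruction of this kind can be sketched as an expansion into real instructions. The helper names below are hypothetical. The encoding follows the Z80 `ld r,r'` format, where the opcode byte is binary 01, then three destination-register bits, then three source-register bits.

```python
# Sketch: expand the pseudo-instruction "ld hl,bc" into two real Z80 loads.
# Helper names are illustrative; register numbers follow the Z80 "ld r,r'" format.
REG8 = {"b": 0, "c": 1, "d": 2, "e": 3, "h": 4, "l": 5, "a": 7}
PAIRS = {"bc": ("b", "c"), "de": ("d", "e"), "hl": ("h", "l")}  # (high, low)


def expand_ld_pair(dst: str, src: str) -> list:
    """Return the two 8-bit loads that copy register pair src into dst."""
    dst_hi, dst_lo = PAIRS[dst]
    src_hi, src_lo = PAIRS[src]
    return [f"ld {dst_lo},{src_lo}", f"ld {dst_hi},{src_hi}"]


def encode_ld_r_r(dst: str, src: str) -> int:
    """Encode 'ld dst,src' for 8-bit registers: 01 ddd sss."""
    return 0x40 | (REG8[dst] << 3) | REG8[src]


for instruction in expand_ld_pair("hl", "bc"):
    dst, src = instruction[3:].split(",")
    print(f"{instruction:8} -> {encode_ld_r_r(dst, src):#04x}")
# ld l,c   -> 0x69
# ld h,b   -> 0x60
```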

Tag use

Use the [assembly] tag for assembly language programming questions, on any processor. You should also use a tag for your processor or instruction set architecture (for example, [x86]). Consider a tag for your assembler as well (for example, [gnu-assembler]).

If your question is about inline assembly in C or other programming languages, see the tag dedicated to inline assembly. For questions about .NET assemblies, use [.net-assembly] instead, and for .NET's Common Intermediate Language, use [cil]. For Java ASM, use the tag [java-bytecode-asm].

Resources

Beginner's resources

Assembly language tutorials, guides, and reference material

43242 questions
10
votes
6 answers

Programming graphics in assembler?

I've developed a running Super Mario Sprite using Visual C++ 6.0 and DirectX. But this isn't very satisfying to me (abusing a 3D-Multimedia-framework for displaying a 2D sprite only), so I would like to be able to program an animated sprite using C…
inno
10
votes
1 answer

Does vzeroall zero registers ymm16 to ymm31?

The documentation for vzeroall appears inconsistent. The prose says: The instruction zeros contents of all XMM or YMM registers. The pseudocode below that, however, indicates that in 64-bit mode only registers ymm0 through ymm15 are affected: IF…
BeeOnRope
10
votes
1 answer

Why does arm-gcc decrement/increment the stack pointer even when the stack is never accessed?

When compiling this program with arm-elf-gcc-4.5 -O3 -march=armv7-a -mthumb -mfpu=neon -mfloat-abi=softfp: #include <arm_neon.h> extern float32x4_t cross(const float32x4_t& v1, const float32x4_t& v2) { float32x4x2_t xxyyzz1(vzipq_f32(v1,…
jcayzac
10
votes
1 answer

Does Skylake need vzeroupper for turbo clocks to recover after a 512-bit instruction that only reads a ZMM register, writing a k mask?

Writing a ZMM register can leave a Skylake-X (or similar) CPU in a state of reduced max-turbo indefinitely. (SIMD instructions lowering CPU frequency and Dynamically determining where a rogue AVX-512 instruction is executing) Presumably Ice Lake…
Peter Cordes
10
votes
1 answer

MOVSD performance depends on arguments

I just noticed that a piece of my code exhibits different performance when copying memory. A test showed that memory copying performance degrades if the address of the destination buffer is greater than the address of the source. Sounds ridiculous, but the…
user4859735
10
votes
2 answers

Who Decides Between I/O Mapped and Memory Mapped I/O (x86)

In x86 architecture we use I/O instructions like IN and OUT for I/O mapped I/O. We use memory instructions like MOV in memory mapped I/O as far as I know. This is all nice but who decides which I/O method will be used? If I want to build my own…
Tom Milberg
10
votes
1 answer

X86: What does `movsxd rdx,edx` instruction mean?

I have been playing with Intel MPX and found that it adds certain instructions that I could not understand. For example (in Intel format): movsxd rdx,edx I found this, which talks about a similar instruction - MOVSX. From that question, my…
R4444
10
votes
6 answers

Optimising this C (AVR) code

I have an interrupt handler that just isn't running fast enough for what I want to do. Basically I'm using it to generate sine waves by outputting a value from a lookup table to a PORT on an AVR microcontroller but, unfortunately, this isn't…
JimR
10
votes
1 answer

Why does Java compile to assembly twice?

I compiled a simple Java file to assembly using Java 8 on Mac OS X. This is Test.java: public class Test { static volatile int a = 1; public static void main(String[] args) { a++; } } I output the assembly code using: java…
Dolphin
10
votes
1 answer

what would be the benefit of moving a register to itself in x86-64

I'm doing a project in x86-64 NASM and came across the instruction: mov rdi, rdi in the output of a compiler my professor wrote. I have searched all over but can't find mention of why this would be needed. Does it affect the flags or is it…
nrmad
10
votes
2 answers

How to use address constants in GCC x86 inline assembly

The GCC toolchain uses AT&T assembler syntax by default, but support for Intel syntax is available via the .intel_syntax directive. Additionally, both AT&T and Intel syntax are available in a prefix and a noprefix version, which differ in whether or…
Christoph
10
votes
3 answers

How can x86 bsr/bsf have fixed latency, not data dependent? Doesn't it loop over bits like the pseudocode shows?

I am on the hook to analyze some "timing channels" of some x86 binary code. I am posting one question to comprehend the bsf/bsr opcodes. At a high level, these two opcodes can be modeled as a "loop", which counts the leading and trailing zeros of a…
lllllllllllll
10
votes
1 answer

while(i--) optimization by gcc and clang: why don't they use sub / jnc?

Some people write such code when they need a loop without a counter or with an n-1, ..., 0 counter: while (i--) { ... } A specific example: volatile int sink; void countdown_i_used() { unsigned i = 1000; while (i--) { sink = i; //…
l4m2
10
votes
1 answer

Understanding how `lw` and `sw` actually work in a MIPS program

I'm having a bit of difficulty understanding what sw and lw do in a MIPS program. My understanding of the topic is that we use lw to transfer data from memory into a register and vice versa for sw. But how exactly is this accomplished? Let's…
Ski Mask
10
votes
1 answer

Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?

I am disassembling this code on llvm clang Apple LLVM version 8.0.0 (clang-800.0.42.1): int main() { float a=0.151234; float b=0.2; float c=a+b; printf("%f", c); } I compiled with no -O specifications, but I also tried with -O0…
Stefano Borini