Questions tagged [assembly]

Assembly language questions. Please tag the processor and/or instruction set you are using, as well as the assembler; a valid set of tags looks like this: [assembly] [x86] and [gnu-assembler] or [att]. Use the [.net-assembly] tag instead for .NET assemblies, [cil] for .NET assembly language, and [java-bytecode-asm] for Java bytecode.

Assembly is a family of very low-level programming languages, just above machine code. In assembly, each statement typically corresponds to a single machine code instruction. These instructions are represented as mnemonics in the given assembly language and are converted into executable machine code by a utility program called an assembler; the conversion process is referred to as assembly, or assembling the code.
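The mnemonic-to-machine-code step can be sketched as a lookup table. This is a toy illustration, not a real assembler: the function name assemble and the table OPCODES are made up for the example, and the table covers only a handful of operand-less, single-byte x86 instructions.

```python
# Toy assembler: each statement maps to exactly one machine code byte.
# Only a few operand-less, single-byte x86 instructions are covered.
OPCODES = {
    "nop": 0x90,
    "ret": 0xC3,
    "hlt": 0xF4,
    "cli": 0xFA,
    "sti": 0xFB,
}


def assemble(source: str) -> bytes:
    """Translate one mnemonic per line into machine code bytes."""
    out = bytearray()
    for line in source.splitlines():
        # Drop ';' comments and surrounding whitespace.
        stmt = line.split(";", 1)[0].strip().lower()
        if not stmt:
            continue
        if stmt not in OPCODES:
            raise ValueError(f"unknown mnemonic: {stmt!r}")
        out.append(OPCODES[stmt])
    return bytes(out)


if __name__ == "__main__":
    print(assemble("cli\nnop ; do nothing\nret").hex())
```

A real assembler also has to handle operands, labels, and directives, which is where most of its complexity lives.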

Language design

Basic elements

There is a large degree of diversity in the way that assemblers categorize statements and in the nomenclature that they use. In particular, some describe anything other than a machine mnemonic or extended mnemonic as a pseudo-operation (pseudo-op). A typical assembly language consists of three types of instruction statements that are used to define program operations:

  • Opcode mnemonics
  • Data sections
  • Assembly directives

Opcode mnemonics and extended mnemonics

Instructions (statements) in assembly language are generally very simple, unlike those in high-level languages. Generally, a mnemonic is a symbolic name for a single executable machine language instruction (an opcode), and there is at least one opcode mnemonic defined for each machine language instruction. Each instruction typically consists of an operation or opcode plus zero or more operands. Most instructions refer to a single value, or a pair of values. Operands can be immediate (value coded in the instruction itself), registers specified in the instruction or implied, or the addresses of data located elsewhere in storage. This is determined by the underlying processor architecture: the assembler merely reflects how this architecture works. Extended mnemonics are often used to specify a combination of an opcode with a specific operand. For example, the System/360 assemblers use B as an extended mnemonic for BC with a mask of 15 and NOP for BC with a mask of 0.
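The System/360 example can be pictured as a rewrite done before encoding. The following is a minimal sketch in Python, assuming statements are plain strings of the form "MNEMONIC operands"; the helper name expand and the table EXTENDED are hypothetical and only cover the two extended mnemonics named above.

```python
# Extended mnemonic -> (base mnemonic, mask), as described for System/360:
# B is BC with a mask of 15, NOP is BC with a mask of 0.
EXTENDED = {
    "B": ("BC", 15),
    "NOP": ("BC", 0),
}


def expand(statement: str) -> str:
    """Rewrite an extended mnemonic into its base opcode plus mask."""
    parts = statement.split(None, 1)
    if not parts:
        return statement
    mnemonic = parts[0].upper()
    operands = parts[1] if len(parts) > 1 else ""
    if mnemonic in EXTENDED:
        base, mask = EXTENDED[mnemonic]
        return f"{base} {mask},{operands}"
    # Anything else is passed through unchanged.
    return statement


if __name__ == "__main__":
    print(expand("B LOOP"))    # BC 15,LOOP
    print(expand("NOP LOOP"))  # BC 0,LOOP
```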

Extended mnemonics are often used to support specialized uses of instructions, often for purposes not obvious from the instruction name. For example, many CPUs do not have an explicit NOP instruction, but do have instructions that can be used for the purpose. On the 8086, the instruction xchg ax,ax is used for nop, with nop being a pseudo-opcode that encodes xchg ax,ax. Some disassemblers recognize this and will decode the xchg ax,ax instruction as nop. Similarly, IBM assemblers for System/360 and System/370 use the extended mnemonics NOP and NOPR for BC and BCR with zero masks. For the SPARC architecture, these are known as synthetic instructions.
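The xchg ax,ax case can be checked with a little arithmetic. A minimal sketch in Python, assuming 16-bit operand size: the one-byte form of xchg ax,reg is encoded as 0x90 plus the register number, so with ax (register 0) the result is 0x90, the byte that nop stands for. The function name encode_xchg_ax is made up for this illustration.

```python
# 16-bit general-purpose register numbers as used in x86 encodings.
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3,
         "sp": 4, "bp": 5, "si": 6, "di": 7}

NOP_BYTE = 0x90  # the single byte that the nop mnemonic assembles to


def encode_xchg_ax(reg: str) -> int:
    """Return the one-byte encoding of 'xchg ax,<reg>' (short form)."""
    return 0x90 + REG16[reg.lower()]


if __name__ == "__main__":
    # xchg ax,ax and nop produce the same byte.
    print(hex(encode_xchg_ax("ax")), encode_xchg_ax("ax") == NOP_BYTE)
    # Exchanging with any other register gives a different byte.
    print(hex(encode_xchg_ax("bx")))
```

This is why a disassembler that sees the byte 0x90 cannot tell which mnemonic the programmer wrote, and conventionally prints nop.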

Some assemblers also support simple built-in macro-instructions that generate two or more machine instructions. For instance, with some Z80 assemblers the instruction ld hl,bc is recognized to generate ld l,c followed by ld h,b. These are sometimes known as pseudo-opcodes.
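The Z80 example amounts to a small source-level expansion. Below is a rough sketch in Python of how such a built-in macro might work; the function expand_ld and the table PAIRS are hypothetical, and extending the rule from hl,bc to the bc, de and hl pairs in general is an assumption made for illustration rather than the behavior of any particular assembler.

```python
# Register pairs split into (high, low) 8-bit halves.
PAIRS = {"bc": ("b", "c"), "de": ("d", "e"), "hl": ("h", "l")}


def expand_ld(statement: str) -> list[str]:
    """Expand 'ld pair,pair' into two 8-bit loads; pass others through."""
    mnemonic, _, operands = statement.strip().partition(" ")
    if mnemonic.lower() != "ld":
        return [statement]
    dst, _, src = operands.replace(" ", "").lower().partition(",")
    if dst in PAIRS and src in PAIRS:
        dst_hi, dst_lo = PAIRS[dst]
        src_hi, src_lo = PAIRS[src]
        # Low byte first, then high byte, matching the order in the text.
        return [f"ld {dst_lo},{src_lo}", f"ld {dst_hi},{src_hi}"]
    return [statement]


if __name__ == "__main__":
    for line in expand_ld("ld hl,bc"):
        print(line)
```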

Tag use

Use the [assembly] tag for assembly language programming questions, on any processor. You should also use a tag for your processor or instruction set architecture ([x86], etc.). Consider a tag for your assembler as well ([gnu-assembler], etc.).

If your question is about inline assembly in C or other programming languages, see the tag for inline assembly. For questions about .NET assemblies, use [.net-assembly] instead, and for .NET's Common Intermediate Language, use [cil]. For Java ASM, use the tag [java-bytecode-asm].

Resources

Beginner's resources

Assembly language tutorials, guides, and reference material

43242 questions
10
votes
3 answers

SIGSEGV in optimized version of code

My knowledge of the Intel instruction set is a bit rusty. Can you tell me why I might be getting a segmentation fault in the optimized version of my function (bonus points if you can tell me why I don't get it in the -O0 build of the code. It's C…
laslowh
  • 8,482
  • 5
  • 34
  • 45
10
votes
2 answers

How do I compile assembly routines for use with a C program (GNU assembler)?

I have a set of assembly functions which I want to use in C programs by creating a header file. For instance, if I have asm_functions.s which defines the actual assembly routines and asm_functions.h which has prototypes for the functions as well as…
Mr. Shickadance
  • 5,283
  • 9
  • 45
  • 61
10
votes
2 answers

Outputting integers in assembly on Linux

This needs to be done in pure assembly (i.e. no libraries or calls to C). I understand the essence of the problem: one needs to divide the integer by 10, convert the one-digit remainder to ASCII, output that and then repeat the process with the…
David Chouinard
  • 6,466
  • 8
  • 43
  • 61
10
votes
1 answer

Compiler (G++) seems to allocate more memory for instances of classes than it needs

I am learning about how compilers represent C++ programs in assembly. I have a question about something that the compiler does that I can't make sense of. Here is some C++ code: class Class1 { public: int i; char ch; }; int main() { Class1…
Grady S
  • 350
  • 3
  • 14
10
votes
3 answers

Redundant instruction in compiled code

Possible Duplicate: What's the point of LEA EAX, [EAX]? During a disassembly practice, I have observed the following code: test.cpp: #include int main(int argc, char * argv[]) { for (int i = 0; i < 10 ; ++i) { …
MByD
  • 135,866
  • 28
  • 264
  • 277
10
votes
4 answers

Free IDE + assembler + software emulator for x86 (MASM) assembly?

I'm currently trying to get into x86 assembly (I already have some knowledge of x51 assembly) and I'm looking for a simple IDE+assembler+emulator for the assembly output. Can you recommend any?
TravisG
  • 2,373
  • 2
  • 30
  • 47
10
votes
1 answer

Need to exploit buffer overflow. Can't figure out how to uncorrupt the stack after executing exploit code?

Basically the function I am exploiting is this: int getbufn() { char buf[512]; Gets(buf); return 1; } When I run the main program the function executes 5 times and each time the location of buf changes and so does the location of…
michael60612
  • 397
  • 2
  • 10
10
votes
6 answers

AND faster than integer modulo operation?

It is possible to re-express i % m as i & (m-1), where i is an unsigned integer and m is a power of 2. My question is: is the AND operation any faster? Don't modern CPUs support integer modulo in hardware in a single instruction? I'm interested…
user48956
  • 14,850
  • 19
  • 93
  • 154
10
votes
3 answers

Why does my Cortex-M4 assembly run slower than predicted?

I'm writing some assembly code for the Cortex-M4, specifically the STM32F407VG found in the STM32F4DISCOVERY kit. The code is extremely performance-sensitive, so I'm looking to squeeze every last cycle out of it. I have benchmarked it (using the DWT…
swineone
  • 2,296
  • 1
  • 18
  • 32
10
votes
4 answers

C++: What are R-Value references on a technical level (ASM)?

Possible Duplicate: What is the difference between r-value references and l-value references? (CodeGen) I was wondering, can anyone explain what R-Value references are on a technical level? By that I mean: What happens on assembler level when…
PuerNoctis
  • 1,364
  • 1
  • 15
  • 34
10
votes
1 answer

SIMD-within-a-register version of min/max

Suppose I have two uint16_t[4] arrays, a and b. Each integer in these arrays is in the range [0, 16383], so bits 14 and 15 aren't set. Then I have some code to find the minimum and maximum among a[i] and b[i] for each i: uint16_t min[4], max[4]; for…
swineone
  • 2,296
  • 1
  • 18
  • 32
10
votes
2 answers

How do I declare a memory range as uncacheable using gcc on an x86 platform?

Although I have read about the movntdqa instruction in this regard, I have not figured out a clean way to mark a memory range as uncacheable or to read data without polluting the cache. I want to do this from gcc. My main goal is to swap to random…
Kabira K
  • 1,916
  • 2
  • 22
  • 38
10
votes
2 answers

Difference between `bx` and `bp`?

What is the difference between bx and bp in assembly? Example here: mov bx, 1h mov bp, 1h Do they refer to the same memory? Is it the same with ss and sp?
tina nyaa
  • 991
  • 4
  • 13
  • 25
10
votes
3 answers

order for encoding x86 instruction prefix bytes

I know that x86 instructions can have a maximum of 4 bytes of prefixes, e.g. lock, rep, segment overrides, etc. Is there any particular order in which they should appear, in case multiple prefixes are used?
pankaj
  • 335
  • 4
  • 15
10
votes
2 answers

objdump and ARM vs Thumb

I'm trying to disassemble an object built for ARM with gcc. Unfortunately, objdump is trying to guess whether the code is ARM or Thumb, and is getting it wrong: it thinks my code is Thumb when it's actually ARM. I see that objdump has an option to…
David Given
  • 13,277
  • 9
  • 76
  • 123