10

I will list exactly what I do not understand, and show you the parts I can not understand as well.

First off,

The .Align Directive

  1. .align integer, pad. The .align directive causes the next data generated to be aligned modulo integer bytes

1. What is implied by "causes the next data generated to be aligned modulo integer bytes"? I can surmise that the next data generated is a memory-to-register transfer, no? Modulo would imply the remainder of a division. I do not understand "to be aligned modulo integer bytes".

What would be a remainder of a simple data declaration, and how would the next data generated being aligned by a remainder be useful? If the next data is aligned modulo, that is saying the next generated data, whatever that means exactly, is the remainder of an integer? That makes absolutely no sense.

What specifically would the .align, say, .align 8 directive issued in x86 for a data byte compiled from a C char, i.e., char CHARACTER = 0; be for? Or specifically coded directly with that directive, not preliminary Assembly code after compiling C? I have debugged in Assembly and noticed that any C/C++ data declarations, like chars, ints, floats, etc. will insert the directive .align 8 to each of them, and add other directives like .bss, .zero, .globl, .text, .Letext0, .Ltext0.

What are all of these directives for, or at least .align, which is my main question? I have learned a lot of the main x86 Assembly instructions, but was never introduced or pointed to all of these strange directives. How do they affect the opcodes, and are all of them necessary?

Peter Cordes
Sinister Clock
  • 2
    It just means that the assembler will place the next byte at an address evenly divisible by *integer*, so if e.g. the last byte was placed at `0x0eda`, then ordinarily, the next byte would be placed at `0x0edb`, but with an `.align 8` directive in place, the next byte would be placed at `0x0ee0`, the next address that is evenly divisible by 8 – microtherion Jun 25 '13 at 19:48
  • Note that .align is for anything the assembler outputs, such as machine code, not just for what you in C would call "data" – nos Jun 26 '13 at 11:08

4 Answers

9

As mentioned in the comments, it means the assembler will add enough padding bytes so that the next data lands at an address divisible by the alignment value. This is important because aligned memory access is generally faster than unaligned memory access. (Loading a doubleword from 0x10000 is better than loading a doubleword from 0x10001). It might also be useful if you are interfacing with other components and need to send/receive structs of data with a given padding/alignment.

faffaffaff
  • 2
    *Loading a doubleword from 0x10000 is better than loading a doubleword from 0x10001* A better example would be a cache-line or page split, like `0xffff` is much worse than `0x10000` because that's true on all CPUs. Misalignment within a cache line (or within a 16-byte chunk of a cache line) has literally zero extra cost in a lot of cases on most modern x86 CPUs, assuming normal (cacheable) memory. – Peter Cordes Apr 07 '20 at 20:06
7

First, note that .align is not an x86-specific concept but a GNU GAS directive, documented in the GAS manual. It can also be used on other architectures. x86 does not specify directives, only instructions.

Now let's play with it to understand it:

a.S

.byte 1
.align 16
sym: .byte 2

Assemble and disassemble:

as -o a.o a.S
objdump -Sd a.o

Output:

0000000000000000 <a-0x10>:
   0:   01 0f                   add    %ecx,(%rdi)
   2:   1f                      (bad)  
   3:   44 00 00                add    %r8b,(%rax)
   6:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
   d:   00 00 00 

0000000000000010 <sym>:
  10:   02                      .byte 0x2

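So sym was moved to byte 16, the first multiple of 16 after the .byte 1 we placed, which aligns it to 16 bytes.

The bytes used to fill the gap between 01 and 02 are chosen by GAS. Since the default section is .text and no fill value was given, GAS pads with no-op instructions: 0f 1f 44 00 00 and 66 2e 0f 1f 84 00 00 00 00 00 are multi-byte NOPs. objdump mis-decodes the first one only because it starts decoding at the 01 data byte.

Now let's try a different input: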

.skip 5
.align 4
sym: .byte 2

Gives:

0000000000000000 <sym-0x8>:
   0:   00 00                   add    %al,(%rax)
   2:   00 00                   add    %al,(%rax)
   4:   00 0f                   add    %cl,(%rdi)
   6:   1f                      (bad)  
    ...

0000000000000008 <sym>:
   8:   02                      .byte 0x2

So this time sym was moved to 8, which is the first multiple of 4 that comes after 5.

Ciro Santilli OurBigBook.com
  • 2
    I think there are some ISAs where `.align` is a synonym for `.p2align` (power of 2), not `.balign` (byte). On x86 it's `.balign`, like the `align` directives in most other assemblers like MASM and NASM. – Peter Cordes Apr 07 '20 at 20:01
  • would you mind taking Trump out of your username? SO usernames aren't a great place for random political statements, especially ones unrelated to SO management. (And Trump is now a private citizen so there are fewer grounds for arguing the ban should be reversed at this point, only that it shouldn't have happened in the first place several months ago.) – Peter Cordes Mar 13 '21 at 11:57
  • @PeterCordes hi, related meta threads at: https://cirosantilli.com/china-dictatorship/#stack-overflow please report to a mod or create a new thread and ping me – Ciro Santilli OurBigBook.com Mar 13 '21 at 13:46
  • Ah, fair enough, I stand corrected. They are *allowed*. I still wouldn't like to see everyone's username turn into a political statement, even if they were all ones I agreed with, though. (would you? Perhaps you would.) So I maintain it's still in somewhat poor taste and something you personally might want to consider voluntarily changing at this point, if you agree. – Peter Cordes Mar 13 '21 at 13:55
  • 1
    Oh and BTW, [Segfault with RIP-relative addressing on Linux](https://stackoverflow.com/q/9177418) reports that `.align` on x86-64 MacOS/clang is a synonym for `.p2align`, not `.balign`. So it's not even portable between different x86 systems in GAS syntax and should never be used. – Peter Cordes Mar 13 '21 at 13:57
  • @PeterCordes I am a huge supporter of freedom of speech, and that people should be able to say whatever they want on their personal profiles, as long as it is legal in the jurisdiction where Stack Overflows servers are located. Thanks for the MacOS note. – Ciro Santilli OurBigBook.com Mar 13 '21 at 14:03
3

The main reason for the align directive is to speed up execution. If a call or jmp target is at an odd address, it may need extra bus transfers and/or an advance to the exact byte. The same applies to data. The old 80386 manual listed penalties for certain opcodes when the target was misaligned.

I found it in the manual (from http://css.csail.mit.edu/6.858/2011/readings/i386.pdf) on page 24:

Such misaligned data transfers reduce performance by requiring extra memory
cycles. For maximum performance, data structures (including stacks) should
be designed in such a way that, whenever possible, word operands are aligned
at even addresses and doubleword operands are aligned at addresses evenly
divisible by four. Due to instruction prefetching and queuing within the
CPU, there is no requirement for instructions to be aligned on word or
doubleword boundaries. (However, a slight increase in speed results if the
target addresses of control transfers are evenly divisible by four.)
ott--
0

Modulo refers to the modulo operation in arithmetic, i.e. the % operator in C, or in other words the remainder.

"Aligned modulo n" usually means that the remainder of the address divided by n equals 0. If you want to place an address "modulo 4", that means that (address % 4) == 0, which is true for the following examples: 0, 4, 8, 0xC, 0x10, etc.

Hardware restrictions sometimes require that data be aligned to a large integer. For example, some DMA engines might require modulo 64.

Mark Lakata