
Definition

Structure padding is the process of aligning the data members of a structure in accordance with the memory alignment rules specified by the processor.


What is the memory alignment rule for an Intel x86 processor?

As per my understanding, the natural address boundary for an Intel x86 processor is 32 bits (i.e., addressOffset % 4 == 0).

So, on an x86 processor,

struct mystruct_A {
    char a;
    int b;
    char c;
};

will be constructed as,

struct mystruct_A {
    char a;
    char gap_0[3]; /* inserted by compiler: for alignment of b using array */
    int b;
    char c;
    char gap_1[3]; /* for alignment of the whole struct using array */
};
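
One way to check what a particular compiler actually inserts here is to print the member offsets and the total size; this is just a quick sketch, and the exact numbers depend on your compiler and ABI:

#include <stdio.h>
#include <stddef.h>

struct mystruct_A {
    char a;
    int b;
    char c;
};

int main(void)
{
    /* offsetof() reveals the padding the compiler actually inserted */
    printf("a at %zu, b at %zu, c at %zu, sizeof = %zu\n",
           offsetof(struct mystruct_A, a),
           offsetof(struct mystruct_A, b),
           offsetof(struct mystruct_A, c),
           sizeof(struct mystruct_A));
    return 0;
}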

What is the memory alignment rule for an Intel x86-64 processor?

As per my understanding, the natural address boundary for an Intel x86-64 processor is 64 bits (i.e., addressOffset % 8 == 0).

So, on an x86-64 processor,

struct mystruct_A {
    char a;
    int b;
    char c;
};

will be constructed as,

struct mystruct_A {
    char a;
    char gap_0[7]; /* inserted by compiler: for alignment of b using array */
    int b;
    char c;
    char gap_1[7]; /* for alignment of the whole struct using array */
};
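
To verify which of the two layouts a compiler actually produces (the 8-byte gaps above are only my assumption), the alignment requirement of int itself can be printed; a sketch, assuming a C11 compiler:

#include <stdalign.h>   /* alignof, C11 */
#include <stdio.h>

struct mystruct_A {
    char a;
    int b;
    char c;
};

int main(void)
{
    /* If this prints 4, int is still aligned on a 4-byte boundary on
       this target, and the gaps above would be 3 bytes rather than 7. */
    printf("alignof(int) = %zu\n", alignof(int));
    printf("sizeof(struct mystruct_A) = %zu\n", sizeof(struct mystruct_A));
    return 0;
}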

If the above understanding is correct, then I would like to know: why use an array of int for bit operations?

Using int-sized data is recommended, as mentioned here, because the most cost-efficient access to memory is accessing int-sized data.
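
For context, a bit array built on int-sized words typically looks roughly like this (an illustrative sketch only; the exact word type used in the linked recommendation is my assumption):

#include <limits.h>
#include <stddef.h>

/* Number of bits held by one unsigned int word */
#define WORD_BITS (sizeof(unsigned int) * CHAR_BIT)

/* Set bit n in an array of unsigned int words */
static void set_bit(unsigned int *words, size_t n)
{
    words[n / WORD_BITS] |= 1u << (n % WORD_BITS);
}

/* Test bit n in an array of unsigned int words */
static int test_bit(const unsigned int *words, size_t n)
{
    return (words[n / WORD_BITS] >> (n % WORD_BITS)) & 1u;
}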

Question:

Is it this memory alignment rule that forces one to declare int-sized data for bit operations?

overexchange
  • Each member should be aligned wrt its size; char needs no alignment and a 4-byte int needs to be aligned at a 32-bit boundary, so ordering the members by their sizes is one way of memory improvement. – perreal Jan 04 '17 at 10:36
  • @perreal Did I say the same? `for alignment of b using array`; I mean for `int b`. You have described it better. – overexchange Jan 04 '17 at 10:37
  • Your prerequisite is wrong already. Please provide a reference to the standard where it imposes this restriction on padding. It is typically defined by the platform's ABI, which includes the OS, not only the CPU. – too honest for this site Jan 04 '17 at 10:40
  • *Natural size* is a property of the data type, not the processor. 16-bit data is aligned on a 2-byte boundary, and so on. So your reasoning is wrong. And we may use an array of ints for bit fields because that lets us work with a group of bits together. Chars will also often do. – Margaret Bloom Jan 04 '17 at 10:42
  • @MargaretBloom Bit-fields are only guaranteed by the standard to work with `int`/`unsigned int` and `_Bool`. – too honest for this site Jan 04 '17 at 10:44
  • @Olaf Alignment rule of 4 bytes is given [here](https://en.wikipedia.org/wiki/Data_structure_alignment), that says: *For example, when the computer's word size is 4 bytes, the data to be read **should** be at a memory address which is **some multiple of 4**. When this is not the case, e.g. the data starts at address 14 instead of 16, then the computer has to read two or more 4 byte chunks and do some calculation before the requested data has been read, or it may generate an **alignment fault**.*. – overexchange Jan 04 '17 at 10:45
  • @Olaf My learning is, *unaligned memory access is slower on architectures that allow it (like x86 and amd64), and is explicitly prohibited on strict alignment architectures like SPARC.* – overexchange Jan 04 '17 at 10:47
  • Wikipedia is **not** the C standard! I don't ask what "alignment" means. Your 2nd comment is not necessarily correct. – too honest for this site Jan 04 '17 at 10:49
  • @Olaf For the past 10 years I have not been able to understand what somebody means by *platform application binary interface (ABI)*. So I am stuck with your first comment. – overexchange Jan 04 '17 at 10:50
  • @Olaf Good point! but I was talking about bit-fields as a general concept (not the C bit-fields). I used an ambiguous terminology, my bad! :) – Margaret Bloom Jan 04 '17 at 10:53
  • @MargaretBloom: for generalised bit-fields, `int` is a very bad choice, because certain shifts are either UB or implementation defined. Correct would be `unsigned int`, better a fixed-width type like `uint32_t` which is guaranteed to have no padding-bits. – too honest for this site Jan 04 '17 at 11:00
  • @overexchange: I did not say "platform ABI" ... – too honest for this site Jan 04 '17 at 11:01

1 Answer


Addendum: this is valid for x86/x86-64 processors, and possibly for others as well. I am blindly assuming you're using one of those; for other processors you should check the respective manuals.


If fasm automatically added fillers into my structs, I'd go insane. In general, performance is better when accesses to memory fall on a boundary corresponding to the size of the element you want to retrieve. That being said, it's not a definite necessity!

This article here might be worth a look: https://software.intel.com/en-us/articles/coding-for-performance-data-alignment-and-structures

Intel's suggestion for an optimal layout is to start with the biggest elements first and go smaller as the structure grows. That way you'll stay aligned properly, as long as the first element is aligned properly. There are no three-byte elements, thus misalignment is out of the question, and all the compiler might do is add bytes at the end, which is the best way to make sure it won't ruin things if you choose to do direct memory accesses instead of using variables.
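
Applied to the struct from the question, that largest-first ordering would look something like this (a sketch; any trailing padding still depends on your ABI):

struct mystruct_A {
    int  b;   /* largest member first, naturally aligned at offset 0 */
    char a;   /* chars need no particular alignment */
    char c;
    /* the compiler may still add trailing bytes so that arrays of this
       struct keep b aligned, but no gaps between members are needed */
};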

The safest procedure is to not rely on your compiler, but instead to align the data properly yourself.
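
If you want to spell the layout out yourself instead of leaving it to the compiler, explicit padding members are one way to do it; a sketch, assuming int needs 4-byte alignment on your target:

struct mystruct_manual {
    char a;
    char pad0[3];   /* put b on a 4-byte boundary by hand */
    int  b;
    char c;
    char pad1[3];   /* keep sizeof a multiple of 4 for arrays of this struct */
};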

Fun Fact: loops work the same way. Padding NOPs in your code, before the start of a loop, can make a difference.

z0rberg's
  • Actually the safest procedure is to rely on the compiler, which is going to be way more consistent than any human in applying the rules regarding alignment. Of course the compiler won't reorder fields in a structure (that can break semantics and ABI), so recent versions of gcc and clang also have warnings for suboptimal struct ordering (they'll tell you that they had to introduce avoidable padding). – Matteo Italia Jan 04 '17 at 11:01
  • Why is `short a1` shown as 1 byte and `int` as 3 bytes in your referred [link](https://software.intel.com/en-us/articles/coding-for-performance-data-alignment-and-structures)? Did I misread the address numbers? Do addresses 12-15 mean addresses 12, 13, 14, 15? – overexchange Jan 04 '17 at 11:02
  • I agree that, if you don't know how it works, then you should let someone else handle it for you. In this case the compiler. That being said, it's much better to know what you are doing instead of willingly handing over control of your code to a piece of software. – z0rberg's Jan 04 '17 at 11:03
  • No, some architectures may have very restrictive alignment requirements. You may think that aligning `uint16_t` on 2 bytes won't give any padding, but it will on some RISC machines. You are only optimising for one architecture, while the compiler can optimise for any. Learning how to use a compiler is much more difficult than learning how to code in assembly for an architecture. – Margaret Bloom Jan 04 '17 at 11:07
  • I prefer to use the tools I have and let them do what I want, instead of doing what my tool wants from me. You're right, it's architecture specific, but that doesn't change anything about it. Relying on your compiler to do what you could do yourself makes you, in my eyes, a tool of your compiler. – z0rberg's Jan 04 '17 at 11:11
  • That's a respectable personal view; when I was younger, I also used to think that. However, being able to work in abstract terms, knowing what is specific to you and what is common with others, is important. Knowing how to make the compiler emit the best code for every possible architecture is anything but being a tool of the compiler; rather, using workarounds that work only in a specific case is losing the battle. I agree with you that knowing the details under the hood is very important, but limiting ourselves to just one "hood" is foolish. – Margaret Bloom Jan 04 '17 at 11:18
  • I don't believe in the myth that compilers produce the best code possible. Still, you're absolutely right that anyone who programs for multiple architectures benefits from not having to deal with considerations about specific implementations. I don't see programming for one single platform as a limitation, though. – z0rberg's Jan 04 '17 at 11:29
  • @z0rberg's: "it's much better to know what you are doing instead of willingly handing over control of your code to a piece of software." That's nonsense. If you use a compiler, you *are* already willingly handing control over to a piece of software. And as someone else said, compilers are much better at *consistently* and correctly applying the principles of alignment and at not forgetting something, e.g. forgetting to change the padding when you made a change. – Rudy Velthuis Jan 04 '17 at 11:57
  • So it's nonsense to know what one is doing and one should instead rely on something/someone else to do it for him, because one should not trust in his own abilities to do it properly and correctly. Okay, thanks for clearing that up. – z0rberg's Jan 04 '17 at 12:12
  • Don't get me wrong, please. More knowledge is better than less knowledge. Inventing a tool that takes care of things means future users will be told to use the tool so they don't have to take care about things themselves, which leads to people having less knowledge about what's going on, because "the tool does it anyway". This might seem irrelevant when it doesn't matter, but more knowledge actually always matters. – z0rberg's Jan 04 '17 at 12:30
  • I agree, and I'm leaving you alone after this :) Just wanted to show you the full picture. You believe this happens as you keep learning: 1) Use high-level tools -> 2) Use a compiler, badly -> 3) Use an assembler -> 4) Has deep understanding of arch & uArch. Instead what happens is this: 1) Use high-level tools -> 2) Use a compiler, badly -> 3) Use an assembler -> 4) Has deep understanding of arch & uArch -> 5) Use **other** assembly languages -> 6) Has deep understanding of **other** archs & uArchs -> 7) Finally understand the rationale of C standard -> 8) Use a compiler, very goodly. – Margaret Bloom Jan 04 '17 at 16:57
  • Heh. :) I love assembly. I grew up with it. I never had a thing for C. I dislike it. It's ugly. I'd rather use freepascal, which is available for a lot of platforms, and mix it with assembly. My favourite asm is ARM, it's just amazing! And I admire your understanding, it's not like I don't know that you're really good. I saw your post for the guy who wanted to access his soundblaster and you really went all in. That was amazing! – z0rberg's Jan 04 '17 at 17:26
  • Beyond what Margaret said, it turns out understanding what's going on at the low level can actually be very frustrating. You start looking at the compiler's output and thinking, "Wow, I could do better if I wrote that by hand!" Of course, you can't actually write everything by hand, or you'd be horribly unproductive. Assembly is great, but there's a reason we use at least *slightly* higher level languages to get real work done. You cannot afford to think about this complicated stuff all the time. Although if you don't know how things work on low level, you're doomed to write inefficient code. – Cody Gray - on strike Jan 04 '17 at 17:41
  • So... uhm ... what do i do now with this answer? – z0rberg's Jan 04 '17 at 17:46