0

If I write the following code in C:

  int n;
  n = 2864434397;
  int i;
  i = &n; //I know there will be a warning, it's ok

Due to the little-endian convention, the variable n on my stack will be, for example:

0xffffd12c: 0xdd    
0xffffd12d: 0xcc    
0xffffd12e: 0xbb    
0xffffd12f: 0xaa

Then if I look at the value of the variable i, I see that i = 0xffffd12c.

This means that the program will read the values at 0xffffd12c and the following three addresses in this way:

n == 0xAABBCCDD == [value of 0xffffd12f | value of 0xffffd12e | value of 0xffffd12d | value of 0xffffd12c]

Am I right?
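One way to check the claim above is to view n through an unsigned char pointer and reassemble the value byte by byte (a minimal sketch for illustration; the cast and the shift expression are not part of the snippet above):

  #include <stdio.h>

  int main(void)
  {
      unsigned int n = 0xaabbccdd;
      unsigned char *p = (unsigned char *) &n;  //  view the individual bytes of n

      //  Reassemble the value from the four bytes, lowest address first.
      unsigned int rebuilt = (unsigned int) p[0]
                           | (unsigned int) p[1] << 8
                           | (unsigned int) p[2] << 16
                           | (unsigned int) p[3] << 24;

      //  On a little-endian machine p[0] is 0xdd and rebuilt equals n.
      printf("p[0] = 0x%02hhx, rebuilt = 0x%08x\n", p[0], rebuilt);
  }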

Q Stack
    `I know there will be a warning, it's ok`, no, it's not. – Sourav Ghosh Apr 22 '21 at 08:02
  • You seem to understand endianness correctly. – Some programmer dude Apr 22 '21 at 08:03
  • @SouravGhosh I mean I was interested in endianness, it's just an example – Q Stack Apr 22 '21 at 08:05
  • @QStack Whatever it is, wrong code is nothing but wrong code, especially code which can cause UB. – Sourav Ghosh Apr 22 '21 at 08:05
  • Processors commonly read as many bytes in parallel as their data bus allows. Most probably your machine will read all bytes in the same memory cycle. Anyway, the different bit lines are "routed" to the corresponding bits of the target in the processor. This principle is true for all endiannesses. – the busybee Apr 22 '21 at 08:16
  • @QStack If you're interested in endianness, you should read [On Holy Wars And A Plea For Peace](https://www.rfc-editor.org/ien/ien137.txt) – Emanuel P Apr 22 '21 at 08:31
  • It is very important to understand that this is not so much dependent on the C language, it is dependent on the architecture that the code is compiled for. – Cheatah Apr 22 '21 at 08:35
  • @Cheatah: The endianness actually depends on the C implementation. Most C implementations use an endianness matching the target processor. But some processors let software select endianness. And a C implementation can be designed to support old software that needs a particular endianness even though it contrasts with the target processor. – Eric Postpischil Apr 22 '21 at 11:19

2 Answers

0

The endianness is not determined by the language (C, in your case) but by the target CPU on which you run your code. So there could be a difference in both bit and byte endianness depending on whether you are running your code on an ARM microcontroller or an x86 CPU.

For further information look here: https://en.wikipedia.org/wiki/Endianness#Hardware
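For example, a small runtime check along these lines (a sketch; it inspects the first byte of an int through an unsigned char pointer) prints a different result depending on the byte order the compiled program actually uses:

#include <stdio.h>

int main(void)
{
    unsigned int x = 1;
    unsigned char *p = (unsigned char *) &x;

    //  On a little-endian target the least significant byte is stored at the
    //  lowest address, so *p is 1; on a big-endian target the first byte is 0.
    if (*p == 1)
        printf("little-endian\n");
    else
        printf("big-endian\n");
}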

Bananenkönig
  • The endianness actually depends on the C implementation. Most C implementations use an endianness matching the target processor. But some processors let software select endianness. And a C implementation can be designed to support old software that needs a particular endianness even though it contrasts with the target processor. – Eric Postpischil Apr 22 '21 at 11:20
  • This may be right for some architectures, but for example the ARM Cortex M architecture has either big- or little-endian fixed in silicon. – Bananenkönig Apr 22 '21 at 11:24
  • The ARM Cortex M architecture does allow selecting endianness in each processor implemented in the architecture. However, that is not an example of what I said. There are specific processor models that allow **software** to select endianness. The same processor can execute both software using big-endian order and software using little-endian order. Ultimately, the order of bytes in memory is a choice of the C implementation. – Eric Postpischil Apr 22 '21 at 11:49
0

The program in the question does not contain any code to read the values from memory. If i = &n; is accepted by the compiler, it merely sets i to the address of n and does not read any bytes of n. Additionally, 2864434397 overflows an int (on implementations where int is 32 bits), so the result of n = 2864434397; is implementation-defined.

To examine the individual bytes in memory, we can use this:

#include <stdio.h>
#include <stdlib.h>


int main(void)
{
    //  Use unsigned int so we can avoid complications from a sign bit.
    unsigned int n = 0xaabbccdd;

    /*  Use a pointer (marked with "*") to hold the address of n.
        Use a pointer to unsigned char so we can address the individual bytes.
    */
    unsigned char *p = (unsigned char *) &n;

    //  Use a loop to iterate through the number of bytes in n.
    for (size_t i = 0; i < sizeof n; ++i)

        //  Print each unsigned char (format hhx) in n.
        printf("Byte %zu is 0x%02hhx.\n", i, p[i]);
}

The bytes in memory may appear in the order AA₁₆, BB₁₆, CC₁₆, DD₁₆, but they may appear in other orders. In the C implementation I am using, the output of the program is:

Byte 0 is 0xdd.
Byte 1 is 0xcc.
Byte 2 is 0xbb.
Byte 3 is 0xaa.

Paragraph 2 of clause 6.2.6.1 of the 2018 C standard says the C implementation (mostly the compiler) defines the order in which the bytes of an object such as int are stored:

Except for bit-fields, objects are composed of contiguous sequences of one or more bytes, the number, order, and encoding of which are either explicitly specified or implementation-defined.

Most C implementations use a byte ordering that matches the computer processor they are targeting. However, there are situations in which this is not the case:

  • Some processors let software select endianness. (Endianness refers to whether the “big end” of an integer, its high-value bits, or its “little end,” the low-value bits, are stored at the lower byte address in memory.)
  • A C implementation might be designed to support old software that needs a particular byte order.
  • The bytes of an object might be partly determined by the processor and partly by the compiler. For example, on a “16-bit” processor that only supports 16-bit arithmetic and 16-bit loads and stores, a compiler might support a 32-bit integer type in software, using multiple instructions to load it, to store it, and to do arithmetic. In this case, the 32-bit integer could have two 16-bit parts. The order of the bytes in the 16-bit parts could be determined by the processor, but the order of the two parts would be entirely up to the compiler. So the bytes could appear in memory in the order CC₁₆, DD₁₆, AA₁₆, BB₁₆, as sketched below.
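To make the last bullet concrete, here is a sketch that builds that hypothetical byte layout by hand (the low/high split and the byte order within each part are illustrative assumptions, not any particular compiler's ABI):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t value = 0xAABBCCDD;

    //  Hypothetical layout: the compiler stores the low 16-bit part first,
    //  and the (big-endian) processor stores the high byte of each part first.
    uint16_t low  = (uint16_t) (value & 0xFFFF);  //  0xCCDD
    uint16_t high = (uint16_t) (value >> 16);     //  0xAABB

    unsigned char bytes[4] = {
        (unsigned char) (low  >> 8), (unsigned char) (low  & 0xFF),   //  CC DD
        (unsigned char) (high >> 8), (unsigned char) (high & 0xFF),   //  AA BB
    };

    for (size_t i = 0; i < sizeof bytes; ++i)
        printf("Byte %zu is 0x%02hhx.\n", i, bytes[i]);
}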
Eric Postpischil