75

I've heard that the size of data types such as int may vary across platforms.

My first question is: can someone give an example of what goes wrong when a program assumes an int is 4 bytes, but on a different platform it is, say, 2 bytes?

Another question I had is related. I know people solve this issue with typedefs, so that you have types like u8, u16, u32 which are guaranteed to be 8 bits, 16 bits and 32 bits regardless of the platform. My question is: how is this usually achieved? (I am not referring to the types from the stdint library; I am curious how one can manually enforce that some type is always, say, 32 bits regardless of the platform.)

  • 3
    There are potential issues with overwriting memory. If you are assuming that an integer is 4 bytes when it's 2 bytes on another platform, depending on how the memory is laid out, you could overwrite the next 2 bytes after your integer. – Austin Brunkhorst Nov 27 '13 at 08:45
  • 9
    a good time to read the (old, but still hugely informative) C FAQ from the Usenet days: http://www.faqs.org/faqs/C-faq/abridged/ and then http://www.faqs.org/faqs/C-faq/faq/ (unabridged, so read that one instead if you can! many more details). It talks about many of these, and also about many other often-wrong assumptions (internal representation of NULL, etc.). (The chapters about null and about pointers/arrays are a must-read. The rest is good too, and eye-opening on many subjects.) – Olivier Dulac Nov 27 '13 at 13:48
  • 1
    Please note that the byte ordering might also vary from platform to platform. (+1 for the question - it's better to ask than to assume that "surely `sizeof(void *)` will always be `4`".) – Maciej Piechotka Nov 27 '13 at 14:48
  • @MaciejPiechotka: agreed. And it's good to post those, as many readers could then become aware of the potential pitfall and of its solutions! There are no bad questions [well, if they give enough context], just bad answers ^^ – Olivier Dulac Nov 27 '13 at 17:27

11 Answers

41

I know people solve this issue with typedefs, so that you have types like u8, u16, u32 which are guaranteed to be 8 bits, 16 bits and 32 bits regardless of the platform

There are some platforms which have no types of a certain size (for example TI's 28xxx, where the size of char is 16 bits). In such cases, it is not possible to have an 8-bit type (unless you really want one, but that may introduce a performance hit).

how is this usually achieved?

Usually with typedefs. C99 (and C++11) provide these typedefs in a header (<stdint.h> in C, <cstdint> in C++). So, just use them.

can someone give an example of what goes wrong when a program assumes an int is 4 bytes, but on a different platform it is, say, 2 bytes?

The best example is communication between systems with different type sizes. When sending an array of ints from one platform to another where sizeof(int) differs between the two, one has to take extreme care.

Also, saving an array of ints in a binary file on a 32-bit platform and reinterpreting it on a 64-bit platform.
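
A minimal sketch of the file case, assuming <stdint.h> is available (the save_values helper is just for illustration): write the values through an exact-width type instead of a raw int array, so both platforms agree on the element size. Byte order is a separate concern that still has to be handled.

#include <stdint.h>
#include <stdio.h>

/* Write the data as int32_t so a reader on a platform with a
   different sizeof(int) still sees 4-byte elements. */
void save_values(FILE *fp, const int *values, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        int32_t fixed = (int32_t)values[i];  /* assumes the values fit in 32 bits */
        fwrite(&fixed, sizeof fixed, 1, fp);
    }
}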

Fabrício Matté
BЈовић
  • 14
    +1 for _saving array of ints in a binary file on 32-bit platform, and reinterpreting it on a 64-bit platform._. – legends2k Nov 27 '13 at 09:27
22

In earlier iterations of the C standard, you generally made your own typedefs to ensure you got a (for example) 16-bit type, based on #define strings passed to the compiler, for example:

gcc -DINT16_IS_LONG ...
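
Such a hand-rolled header might have looked roughly like the sketch below (the header name mytypes.h and the macro INT32_IS_LONG are hypothetical, in the same spirit as the command line above):

/* mytypes.h - pre-C99 fixed-width typedefs selected by build flags (sketch) */
#ifdef INT32_IS_LONG
typedef long           int32;    /* e.g. platforms where int is only 16 bits */
typedef unsigned long  uint32;
#else
typedef int            int32;    /* e.g. typical 32-bit platforms            */
typedef unsigned int   uint32;
#endif

typedef short          int16;    /* short is 16 bits on the targets assumed here */
typedef unsigned short uint16;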

Nowadays (C99 and above), there are specific types such as uint16_t, the exactly 16-bit wide unsigned integer.

Provided you include stdint.h, you get exact-width types, at-least-that-width types, the fastest types with a given minimum width, and so on, as documented in C99 7.18 Integer types <stdint.h>. If an implementation has compatible types, it is required to provide them.

Also very useful is inttypes.h which adds some other neat features for format conversion of these new types (printf and scanf format strings).
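
For example (a small sketch, assuming a hosted C99 implementation that provides these types), the PRI/SCN macros from inttypes.h expand to the correct conversion specifiers for the fixed-width types on whatever platform you compile for:

#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    uint16_t port = 8080;
    int64_t  big  = -1234567890123LL;

    /* PRIu16 / PRId64 expand to the right length modifiers here. */
    printf("port = %" PRIu16 ", big = %" PRId64 "\n", port, big);
    return 0;
}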

paxdiablo
  • 1
    Sub question: If the platform does not support a 16-bit integer type, is `uint16_t` simply not defined in `cstdint` etc.? Or does the standard guarantee that the type will always be there (and do stuff internally to make sure it works)? – Martin York Nov 27 '13 at 09:18
  • 5
    No, the C standard only _requires_ it if the implementation has a compatible type. If you're running on a 12-bit DSP for example, then it doesn't _have_ to provide a 16-bit uint16_t. It _may_ but it's not mandatory: `7.18.1.1/3: These types are optional. However, if an implementation provides integer types with widths of 8, 16, 32, or 64 bits, no padding bits, and (for the signed types) that have a two’s complement representation, it shall define the corresponding typedef names.` – paxdiablo Nov 27 '13 at 09:20
  • 4
    So if you use `uint16_t` and the platform does not support it, then we can expect a compilation error during the porting process. – Martin York Nov 27 '13 at 09:22
  • 1
    @Loki, yes, the compiler won't know the type. – paxdiablo Nov 27 '13 at 09:24
16

For the first question: Integer Overflow.

For the second question: for example, to typedef an unsigned 32-bit integer on a platform where int is 4 bytes, use:

 typedef unsigned int u32;

On a platform where int is 2 bytes while long is 4 bytes:

typedef unsigned long u32;

In this way, you only need to modify one header file to make the types cross-platform.

If there are platform-specific macros, this can be achieved without manual modification:

#if defined(PLAT1)
typedef unsigned int u32;
#elif defined(PLAT2)
typedef unsigned long u32;
#endif

If C99 stdint.h is supported, it's preferred.
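
For instance (a sketch), the same u32 alias can then be built on top of the standard header instead of hand-picked base types:

#include <stdint.h>

typedef uint32_t u32;   /* exact-width type supplied by the implementation */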

Yu Hao
  • Never mind, there are such times ... - have a break! – alk Nov 27 '13 at 08:59
  • What is a platform here? Is it the hardware - like x86, x86_64, AMD etc... or is it an operating system - like Solaris, AIX, HP-UX, Linux, macOS, BSD, and IBM z/OS etc... ? – Darshan L Oct 14 '20 at 12:42
8

First of all: never write programs that rely on the width of types like short, int, unsigned int, and so on.

Basically: "never rely on the width if it isn't guaranteed by the standard".

If you want to be truly platform independent and store, e.g., the value 33000 as a signed integer, you can't just assume that an int will hold it. An int is only guaranteed a range of at least -32767 to 32767 (a two's complement implementation typically gives -32768 to 32767). That's just not enough, even though int usually is 32 bits and therefore capable of storing 33000. For this value you definitely need a type wider than 16 bits, so you simply choose int32_t or int64_t. If this type doesn't exist, the compiler will tell you with an error, so it won't be a silent mistake.

Second: C++11 provides a standard header for fixed-width integer types. None of these are guaranteed to exist on your platform, but when they exist, they are guaranteed to have the exact width. See this article on cppreference.com for a reference. The types are named in the format int[n]_t and uint[n]_t, where n is 8, 16, 32 or 64. You'll need to include the header <cstdint>. The C header is of course <stdint.h>.
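
As an illustration of the 33000 example above (a sketch in C; the C++ version just includes <cstdint> instead):

#include <stdint.h>

/* int is only guaranteed a range of -32767..32767, so
       int value = 33000;
   may overflow on a 16-bit platform.  int32_t is exactly 32 bits wherever
   it exists; if the platform cannot provide it, the compiler reports an
   error instead of failing silently. */
int32_t value = 33000;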

stefan
  • 2
    OP: "_I am not referring to types from stdint library - I am curious manually, how can one enforce that some type is always say 32 bits regardless of the platform??)_"; – legends2k Nov 27 '13 at 08:51
  • 2
    @legends2k The correct way to have fixed width integer types _is_ using standard libraries. – stefan Nov 27 '13 at 08:52
  • 4
    Agreed, but that's when you write code, not when you try to learn how such headers are written in the first place. – legends2k Nov 27 '13 at 08:53
  • @legends2k Well that's just platform dependent. If one is interested in that: Just open the header ;-) – stefan Nov 27 '13 at 08:54
  • 7
    "*First of all: Never write programs that rely on the width of types.*" so you're saying we should not rely on `uint32_t` being 32 bits wide? Abstractions are nice and all but eventually there comes a point where you need to make some assumptions to actually get anything done. – Thomas Nov 27 '13 at 09:02
  • 6
    What do you mean, 'never write programs that rely on the width of types'? The width of types directly affects the range of possible values, and that is very important when choosing what types to use, especially for the kind of programming tasks that many people use C/C++ for. If you're writing a filesystem or anything that needs to store a lot of values in constrained memory, you need to make those kinds of decisions. There is a reason strings aren't stored as arrays of unsigned long long. – tfinniga Nov 27 '13 at 09:04
  • 1
    -1, it's impossible not to rely on the width of every type you use. It is necessary to be aware what the numeric limits are and request sufficient ranges. If you are worried about the existence of exact-width types, use the `uint_least[n]_t` and `uint_fast[n]_t` types. – Potatoswatter Nov 27 '13 at 09:05
  • @Thomas, tfinniga and Potatoswatter: Sorry for beeing unclear about that. I meant to write "don't rely on width of `short`, `int`,... but falsely didn't specify. That's corrected now. – stefan Nov 27 '13 at 09:48
  • @legends2k: yes, exactly as you mention, I was mainly interested to see how such headers are written in the first place. I think from the responses I get the picture. One could use `#ifdef` to see which platform the code is being built for, and then based on that use for example `typedef int u32` or `typedef short u32`, right? But this means you must know in advance which type has what size on which platform –  Nov 27 '13 at 10:08
  • @dmcr_code: The example you stated is correct. Of course, you should know what you're targetting. Say tomorrow you are porting your code to some exotic platform which has char as 32-bits, you'll modify the header with a new typedef. These things help you write portable code, but not _absolute_ portable code i.e. any code will have some porting work involved when you move to a new platform, these things will minimize the effort. – legends2k Nov 27 '13 at 11:16
  • Think of real-time DSP! You really want to fit your fixed-point arithmetic to the native bit widths the processor/MCU can handle. Otherwise you will need additional clock cycles to complete an arithmetic operation, and that results in poorer performance, and ultimately a failing design, e.g. in closed-loop control applications. – TFuto Nov 27 '13 at 13:24
6

Usually, the issue happens when you max out the number or when you're serializing. A less common scenario happens when someone makes an explicit size assumption.

In the first scenario:

int x = 32000;
int y = 32000;
int z = x+y;        // can cause overflow for 2 bytes, but not 4

In the second scenario,

struct header {
    int magic;
    int w;
    int h;
};

then one goes to fwrite:

header h;
// fill in h
fwrite(&h, sizeof(h), 1, fp);

// this is all fine and good until one freads from an architecture with a different int size

In the third scenario:

int* x = new int[100];
char* buff = (char*)x;


// now try to change the 3rd element of x via buff assuming int size of 2
*((int*)(buff+2*2)) = 100;

// (of course, it's easy to fix this with sizeof(int))

If you're using a relatively new compiler, I would use uint8_t, int8_t, etc. in order to be sure of the type size.

In older compilers, typedefs are usually defined on a per-platform basis. For example, one may do:

 #ifdef _WIN32
      typedef unsigned char uint8_t;
      typedef unsigned short uint16_t;
      // and so on...
 #endif

In this way, there would be a header per platform that defines the specifics of that platform.
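
Tying this back to the second scenario, a sketch of how the file header struct could be made size-stable across platforms with the fixed-width types (byte order and, strictly speaking, padding are still separate concerns):

#include <stdint.h>

/* Every field now has the same width on every platform that provides
   these types, so the on-disk layout no longer depends on sizeof(int). */
struct header {
    uint32_t magic;
    int32_t  w;
    int32_t  h;
};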

thang
  • 2
    +1 for being the first to mention structs. You should also be aware of what happens when you send a struct over the network. – James Anderson Nov 27 '13 at 10:28
5

I am curious how one can manually enforce that some type is always, say, 32 bits regardless of the platform?

If you want your (modern) C++ program's compilation to fail if a given type is not the width you expect, add a static_assert somewhere. I'd add it near the places where the assumptions about the type's width are made.

static_assert(sizeof(int) == 4, "Expected int to be four chars wide but it was not.");

char is 8 bits on most commonly used platforms, but not all platforms work this way.

chwarr
  • 3
    `sizeof` actually returns size in `char`s, not bytes. So if you want to check size in *bits*, you should do `sizeof(int) * CHAR_BIT == 32`. – user694733 Nov 27 '13 at 09:36
    static_assert is only available in the latest standard. But `uint32_t` and similar types are available from before – Sam Nov 27 '13 at 09:37
  • @user694733 No. Size in chars = size in bytes, by definition. `sizeof(char)==1` – always. – Konrad Rudolph Nov 27 '13 at 13:47
  • @sammy Nope, `uint32_t` etc. were added at the same time as `static_assert`. – Konrad Rudolph Nov 27 '13 at 13:48
  • @KonradRudolph It depends on the definition of the byte. Byte is usually considered to be 8 bits. `char` always has `CHAR_BIT` bits. `CHAR_BIT` is at least 8, but may be more. – user694733 Nov 27 '13 at 14:51
  • @user694733 No, it doesn’t. The standard says, literally, that “`sizeof(char)` … [is] 1”, and “The `sizeof` operator yields the number of bytes in the object representation of its operand” (§5.3.3). This is all that matters. The value of `CHAR_BIT` is irrelevant in this. If `CHAR_BIT==16`, that simply means that on this machine, a byte has 16 bits. – Konrad Rudolph Nov 27 '13 at 14:55
  • @KonradRudolph I am not denying that. But if you must know how many **bits** type has, you have to take `CHAR_BIT` into consideration too, because `sizeof(type)` returns how many `char`s the `type` is. – user694733 Nov 27 '13 at 14:58
  • @user694733 That’s a fair point. I actually just replied to the first part of your first comment. – Konrad Rudolph Nov 27 '13 at 15:02
3

Well, first example - something like this:

int a = 45000; // both a and b 
int b = 40000; // does not fit in 2 bytes.
int c = a + b; // overflows on 16bits, but not on 32bits

If you look into the cstdint header, you will find how all the fixed-size types (int8_t, uint8_t, etc.) are defined - the only thing that differs between architectures is this header file. So, on one architecture int16_t could be:

 typedef int int16_t;

and on another:

 typedef short int16_t;

Also, there are other types which may be useful, like int_least16_t.
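
A one-line sketch of using it (int_least16_t is guaranteed to exist even when an exact 16-bit type is not):

#include <stdint.h>

int_least16_t small_counter = 0;   /* narrowest type with at least 16 bits */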

Nemanja Boric
2
  1. If a type is smaller than you think, then it may not be able to store a value you need to store in it.
  2. To create fixed-size types, you read the documentation for the platforms to be supported and then define typedefs based on #ifdef for the specific platforms.
Juraj Blaho
2

can someone give an example of what goes wrong when a program assumes an int is 4 bytes, but on a different platform it is, say, 2 bytes?

Say you've designed your program to read 100,000 inputs and you're counting them using an unsigned int, assuming a size of 32 bits (a 32-bit unsigned int can count up to 4,294,967,295). If you compile the code on a platform (or with a compiler) where unsigned int is 16 bits (and can therefore count only up to 65,535), the value will wrap around past 65,535 and give a wrong count.
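
A tiny sketch of that wrap-around (count_inputs is just illustrative; the loop stands in for reading the inputs):

unsigned int count_inputs(void)
{
    unsigned int count = 0;

    /* With a 32-bit unsigned int this returns 100000; with a 16-bit one the
       counter wraps modulo 65536 and returns 100000 % 65536 == 34464. */
    for (long i = 0; i < 100000L; ++i)
        ++count;

    return count;
}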

legends2k
1

Compilers are responsible for obeying the standard. When you include <cstdint> or <stdint.h>, they shall provide types with the sizes the standard specifies.

Compilers know which platform they are compiling the code for, so they can use internal macros or magic to build the suitable types. For example, a compiler for a 32-bit machine might generate a __32BIT__ macro and have lines like these in its stdint header file:

#ifdef __32BIT__
typedef __int32_internal__ int32_t;
typedef __int64_internal__ int64_t;
...
#endif

and you can use them.

masoud
0

Bit flags are the trivial example: 0x10000 will cause you problems; you can't mask with it or check whether a bit is set in that 17th position if everything is being truncated or squashed to fit into 16 bits.
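
A short sketch of this (FLAG_BIG and has_big_flag are hypothetical names): on a platform where unsigned int is 16 bits, the 17th bit is simply lost.

#define FLAG_BIG 0x10000UL   /* the 17th bit: needs at least 17 value bits */

int has_big_flag(void)
{
    unsigned int flags = 0;   /* only 16 bits wide on some platforms             */
    flags |= FLAG_BIG;        /* truncated to 0 there; stays 0x10000 on 32+ bits */
    return (flags & FLAG_BIG) != 0;   /* always false on a 16-bit-int platform   */
}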

jheriko