
What data type should be used for a general-purpose integer in C++?

The obvious answer is int, and this made sense in the past, when it was commonly 16-bit on 16-bit machines and 32-bit on 32-bit machines. But 64-bit code is now becoming more common, and int is often still 32-bit on those compilers. So we can no longer assume that int is the "fastest" or largest type for a given system.

Another problem is the propagation of 64-bit values that come from the sizes of data structures and files. I know you can store these values in a 32-bit int and get away with it as long as the size doesn't get too big. But I want to write code which can handle the maximum size of data, if that's what the user wants. I don't want my code to die when the user opens a 5 GB file and wants the whole thing in memory, just because the size is stored in an int somewhere. Systems with 16+ GB of RAM will be the norm someday, and I want my code to still work.

I know that there are types such as vector<T>::size_type to store that data. But what if size data can come from several different container and stream types? Should I use size_t for all integers which may store size information?

So I'm forced to conclude that I should use size_t (or its signed equivalent; I can live with a maximum of 9,223,372,036,854,775,807 bytes per data structure for now) rather than int for general-purpose use. But this is not what I observe in practice, where int is still commonly used.

What integer data type should I use for general-purpose calculations and what are the technical reasons for doing so?

Neil Kirk
  • Use "long long"; it is 64 bit on every OS and is supported from (if I'm not wrong) C99 – GMG Jul 27 '14 at 19:55
  • It depends on what you want to do – Paolo Brandoli Jul 27 '14 at 19:57
  • @GMG, C99 has nothing to do with C++. `long long` has been supported since C++11. – chris Jul 27 '14 at 19:57
  • possible duplicate of [Should I use cstdint?](http://stackoverflow.com/questions/6144682/should-i-use-cstdint) – Nemo Jul 27 '14 at 20:01
  • @Nemo The question is which type I should use, not whether I should use the equivalent type from cstdint or the built-in. – Neil Kirk Jul 27 '14 at 20:05
  • @NeilKirk: The answers there also answer your question. There is no "equivalent type from cstdint"; that itself is platform-dependent, which is kind of the whole point. – Nemo Jul 27 '14 at 20:10
  • Is there any evidence that 32 bit arithmetic is any slower than 64 bit arithmetic on a 64 bit architecture? The bit-ness is more about pointer size, is it not? – Joseph Mansfield Jul 27 '14 at 20:21
  • @JosephMansfield I'm most concerned with loss of data in 64-bit to 32-bit values. – Neil Kirk Jul 27 '14 at 20:26
  • If you are really worried about the size of data, or rather how long an input might turn out to be, then you can try long long if "long" is not long enough. Long long can hold an infinite numerical value and even if you multiply two or more long numerical values, it will still be able to handle such a case. – Juniar Jul 27 '14 at 22:36
  • `"Long long can hold an infinite numerical value"` - false – Neil Kirk Jul 27 '14 at 23:19

2 Answers

Depending on the situation, you use different integers. Generally, there are two big classes of integers: those related to the data that your program models (i.e. domain data), and those related to the construction of the program itself.

Integers from the program's domain (e.g. user data, the data that your program collects or computes, and so on) should be represented with types that have a specific size. In C++ these types are defined in the <cstdint> header. For example, if you need a signed 32-bit type that is portable across platforms, use int32_t; if you need a 64-bit unsigned number, use uint64_t, and so on.

If you are concerned with execution speed, use the "fast" integer types, named int_fastN_t and uint_fastN_t, e.g. uint_fast16_t for a fast unsigned integer of at least 16 bits.

Integers related to program construction are created when you take size of data structures or subtract pointers. Use size_t for sizes, and ptrdiff_t for pointer differences.
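As a rough illustration of the split described above, here is a minimal sketch. The `Sample` record and the three helper functions are hypothetical names invented for this example; only the types themselves come from `<cstdint>` and `<cstddef>`.

```cpp
#include <cstddef>   // std::size_t, std::ptrdiff_t
#include <cstdint>   // std::int32_t, std::uint64_t, std::uint_fast16_t
#include <vector>

// Hypothetical domain record: the field widths are part of the data model,
// so exact-width types are used.
struct Sample {
    std::int32_t  temperature_millidegrees;
    std::uint64_t timestamp_ns;
};

// A size taken from a data structure is a std::size_t.
std::size_t count_samples(const std::vector<Sample>& samples) {
    return samples.size();
}

// Subtracting two pointers into the same array yields a std::ptrdiff_t.
std::ptrdiff_t distance_between(const Sample* first, const Sample* last) {
    return last - first;
}

// uint_fast16_t is at least 16 bits wide; the implementation may pick a
// wider type if that is faster on the target.
std::uint_fast16_t sum_small(std::uint_fast16_t a, std::uint_fast16_t b) {
    return static_cast<std::uint_fast16_t>(a + b);
}
```

Note that `sizeof(std::uint_fast16_t)` may well be larger than 2 on a given platform, which is why the fast types are a poor fit for fields whose width is part of a file or wire format.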

Sergey Kalinichenko
  • `cstdint` is a tighter version of the header for this purpose, I believe. – Nemo Jul 27 '14 at 20:10
  • +1. But you did not mention `int` (or `unsigned`, `long`, etc.) Is your advice never to use them? – Nemo Jul 27 '14 at 20:24
  • @Nemo I do not use `int` or `long long` in anything facing the user of my APIs to be as specific as possible about the type in a platform-independent way. I use `int`s and `long long`s as local variables in situations when it does not matter - for example, for loop indexes when I know the bounds are small, for counters internal to my program, and so on. – Sergey Kalinichenko Jul 27 '14 at 20:29
  • @dasblinkenlight But instead of using int when you know bounds are small, why not use size_t so it's correct no matter the bounds? – Neil Kirk Jul 27 '14 at 23:22
  • @NeilKirk The problem with `size_t` is that it is unsigned. Therefore, loops that decrement the loop variable (i.e. `for (size_t i = size-1 ; i >= 0 ; i--)` loops) will never stop for no obvious reason. – Sergey Kalinichenko Jul 28 '14 at 01:23
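The pitfall from the last comment arises because `i >= 0` is always true for an unsigned type. One common way to count down with `size_t` anyway is to test before decrementing. The sketch below is only an illustration; `reverse_indices` is a hypothetical helper that records the indexes such a loop visits.

```cpp
#include <cstddef>
#include <vector>

// Returns the indexes visited when walking a container of the given size
// from back to front with an unsigned loop variable.
std::vector<std::size_t> reverse_indices(std::size_t size) {
    std::vector<std::size_t> visited;
    // `i-- > 0` compares the old value of i with 0 and then decrements,
    // so the body sees size-1, ..., 1, 0 and the loop exits once the
    // old value of i is 0. An empty container never enters the body.
    for (std::size_t i = size; i-- > 0; ) {
        visited.push_back(i);
    }
    return visited;
}
```

For a size of 3 this visits 2, 1, 0, and for a size of 0 it visits nothing, whereas the `i >= 0` form would wrap around to the largest `size_t` value and keep going.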

int is "general purpose". You should use int unless there is a reason not to. In fact, the very fact that you use anything other than int signals to people reading your code that you are not doing "general purpose calculations". It even says so right in the standard itself:

Plain ints have the natural size suggested by the architecture of the execution environment; the other signed integer types are provided to meet special needs.

If you are indexing into a file that might be very large, you're not doing general purpose calculations anymore. fseek takes an argument of type long, not int. You will use long not because it's the correct "general purpose" integer type, but because you have a "special need".
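To make the "special need" concrete, here is a minimal sketch of guarding an `fseek` call when the offset is held in a 64-bit variable. `offset_fits_in_long` and `seek_from_start` are hypothetical helpers written for this example, not part of any library; only `std::fseek` and `std::numeric_limits` are standard.

```cpp
#include <cstdint>
#include <cstdio>
#include <limits>

// Hypothetical helper: true if the 64-bit offset is representable as long,
// which is the parameter type fseek expects.
bool offset_fits_in_long(std::int64_t offset) {
    return offset >= std::numeric_limits<long>::min() &&
           offset <= std::numeric_limits<long>::max();
}

// Hypothetical wrapper: seeks from the start of the file only when the
// offset is non-negative and fits in long; returns false otherwise.
bool seek_from_start(std::FILE* file, std::int64_t offset) {
    if (file == nullptr || offset < 0 || !offset_fits_in_long(offset)) {
        return false;
    }
    return std::fseek(file, static_cast<long>(offset), SEEK_SET) == 0;
}
```

Where `long` is 32-bit, this wrapper refuses offsets past 2 GB instead of silently truncating them; actually reaching such offsets needs a platform-specific 64-bit seek function, which is outside this sketch.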

If you use long or long long for everything then you will confuse people.

Brian Bi
  • `int` used to have the "natural" size of the architecture but I disagree that is commonly the case on 64-bit systems. `long` is 32-bit in my compiler Visual Studio and I want to open any supported file size. I don't consider opening files a "special need". – Neil Kirk Jul 27 '14 at 20:07
  • @NeilKirk OK, maybe you don't consider processing > 2 GB files a special need. That doesn't change anything. Use whatever type you need to use when processing files. `int` is still the general purpose type. I checked just now, and doing calculations with `int` takes the same amount of time as doing calculations with `long long` on my system, even though it's 64-bit. – Brian Bi Jul 27 '14 at 20:11
  • So that suggests long long should be used instead of int! – Neil Kirk Jul 27 '14 at 20:12
  • @NeilKirk If you use `long long` then it will be slower than `int` on a 32-bit system. You are still going to have to use it if you want to support > 2 GB files on a 32-bit system. But you really should use `int` whenever you don't need a longer type. – Brian Bi Jul 27 '14 at 20:17