13

The problem I have is to create a sort of big integer library. I want to make it both cross-platform and as fast as possible, which means I should try to do math with the largest data type the system natively supports.

I don't actually want to know whether I am compiling for a 32-bit or 64-bit system; all I need is a way to create a 64-bit or 32-bit or whatever-bit integer based on whichever is the largest available. I will be using sizeof to behave differently depending on what that is.

Here are some possible solutions and their problems:

Use sizeof(void*): This gives the size of a pointer to memory. It is possible (though unlikely) that a system may have larger pointers to memory than it is capable of doing math with or vice versa.

Always use long: While it is true that on several platforms long integers are either 4 bytes or 8 bytes depending on the architecture (my system is one such example), some compilers implement long integers as 4 bytes even on 64 bit systems.

Always use long long: On many 32 bit systems, this is a 64 bit integer, which may not be as efficient (though probably more efficient than whatever code I may be writing). The real problem with this is that it may not be supported at all on some architectures (such as the one powering my mp3 player).

To emphasize, my code does not care what the actual size of the integer is once it has been chosen (it relies on sizeof() for anything where the size matters). I just want it to choose the type of integer that will cause my code to be most efficient.
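
For concreteness, here is the rough shape of what I mean (an untested sketch; word_t and words_needed are just placeholder names, and the typedef line is exactly the part I don't know how to write portably):

#include <stddef.h>
#include <stdint.h>

typedef uintptr_t word_t;  /* ??? -- whatever the largest native integer type is */

/* Example of size-agnostic code: how many words a number of n_bytes needs. */
static size_t words_needed(size_t n_bytes)
{
    return (n_bytes + sizeof(word_t) - 1) / sizeof(word_t);
}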

Talia
  • The first thing that comes to my mind is to figure out the size of the CPU's registers in some way... – BlackBear Dec 28 '10 at 23:56
  • have you searched for answer in existing bignum libraries? http://sourceforge.net/projects/bignlibacbignum/ – mth Dec 28 '10 at 23:58
  • Perhaps try moving something into them, then adding something and checking the overflow flag – BlackBear Dec 28 '10 at 23:59
  • 2
    Before doing such a problematic assignment, have you looked at GMP? Because it sounds as if you are trying to do just that. In terms of performance, I doubt you can even come near GMP's implementation, since its performance is well known. – Milan Dec 29 '10 at 00:00
  • @Milan Well, when you consider GMP has assembly-optimised bignums and is known to be one of the fastest bignum libraries on the planet, you have a fair point. However, sometimes these things need to be done just to learn. –  Dec 29 '10 at 00:14
  • GMP is unusable as a library because it calls `abort()` without the caller's permission. – R.. GitHub STOP HELPING ICE Dec 29 '10 at 02:44
  • similar question: http://stackoverflow.com/questions/2036172/word-and-double-word-integers-in-c – ergosys Dec 29 '10 at 04:31
  • Just as a historical note, back in the days of 16-bit DOS, C compilers had memory-model settings. The memory models changed the size of pointers. While it is doubtful you'll be writing code for any of these beasts, it is worth keeping in mind when discussing very low-level details. (Heck, there have even been machines running C with a **byte** being **5** or **9** bits.) – Jeremy J Starcher Sep 22 '12 at 04:56

4 Answers

8

If you really want a native-sized type, I would use size_t, ptrdiff_t, or intptr_t and uintptr_t. On any non-pathological system, these are all going to be the native word size.
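
For example (a minimal sketch with made-up names, assuming a typical implementation where uintptr_t matches the native word size):

#include <stdint.h>
#include <limits.h>

/* One bignum "limb" is one native machine word. */
typedef uintptr_t limb_t;

/* Everything size-dependent is derived from the type itself: */
#define LIMB_BITS ((int) (sizeof(limb_t) * CHAR_BIT))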

On the other hand, there are certainly benefits in terms of simplicity to always working with a fixed size, in which case I would just use int32_t or uint32_t. The reason I say it's simpler is that you often end up needing things like "the largest power of 10 that fits in the type" (for decimal conversion) and other constants that cannot be easily expressed as constant expressions in terms of the type you've used. If you pick a fixed number of bits, you can also fix the convenient constants (like 1000000000 in my example). Of course, by doing it this way you sacrifice some performance on higher-end systems.

You could take the opposite approach and use a larger fixed size (64 bits), which would be optimal on higher-end systems, and assume that the compiler's code for 64-bit arithmetic on 32-bit machines will be at least as fast as your bignum code handling two 32-bit words, in which case it's still optimal.
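
As a concrete illustration of the "convenient constants" point (a sketch with made-up names, not anything from the question):

#include <stdint.h>

typedef uint32_t limb_t;

/* With the limb size fixed at 32 bits, decimal-conversion constants
   can simply be hard-coded: */
#define DEC_BASE   1000000000u   /* largest power of 10 that fits in a limb */
#define DEC_DIGITS 9             /* decimal digits produced per division by DEC_BASE */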

R.. GitHub STOP HELPING ICE
  • Thanks, the first part of your answer is pretty much what I am looking for. – Talia Dec 30 '10 at 02:02
  • On the other hand, I think I will probably be using GMP anyway. – Talia Dec 30 '10 at 02:03
  • 1
    Warning: `ptrdiff_t` is guaranteed to be large enough for pointer subtraction, but no one said it can't be larger (although you don't often see a 64-bit `ptrdiff_t` when `size_t` is 32-bit, even though that means that a difference across more than half the address space can't be stored in a `ptrdiff_t`). Be careful what standards you rely on. – user541686 Jan 01 '11 at 18:37
  • I said non-pathological. Surely you can intentionally make types ridiculously large on an implementation, but I don't think anyone would choose to use such an implementation unless they had no choice. Taking the difference of pointers anywhere except within the same object is undefined behavior, so as long as a 32-bit implementation doesn't allow single objects greater than 2 GB, a 32-bit `ptrdiff_t` is entirely correct. – R.. GitHub STOP HELPING ICE Jan 02 '11 at 03:07
  • 1
    An exception could be 16-bit embedded systems where the native arithmetic size is 16 bits, but `size_t` is possibly 32-bit. Or 8-bit systems where the native arithmetic size is 8 bits, but `int` still has to be minimum 16 bits, and `size_t` is probably also 16-bit. – Craig McQueen Sep 01 '13 at 00:27
  • What about x32 mode? And OpenVMS, for example? – Cyan Apr 30 '16 at 07:39
4

The best way is not to rely on automatic detection, but to target specific compilers with a set of #if/#else statements to choose a type that you have tested and know to be optimal.
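
For instance, such a block might look roughly like this (the predefined macros are real, but the type choices are only placeholders; you would replace them with whatever your own benchmarks show to be fastest on each target):

#include <stdint.h>

#if defined(__GNUC__) && defined(__x86_64__)
typedef uint64_t word_t;   /* GCC/Clang targeting x86-64 */
#elif defined(_MSC_VER) && defined(_M_X64)
typedef uint64_t word_t;   /* MSVC targeting x64 */
#else
typedef uint32_t word_t;   /* conservative fallback */
#endif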

Mark Ransom
0

Here's how we did it in bsdnt:

#include <limits.h>
#include <stdint.h>

#if ULONG_MAX == 4294967295U

/* long is 32 bits: a word is 32 bits; dword_t is a 64-bit double word
   (GCC's mode(DI) attribute). */
typedef uint32_t word_t;
typedef unsigned int dword_t __attribute__((mode(DI)));
#define WORD_BITS 32

#else

/* long is 64 bits: a word is 64 bits; dword_t is a 128-bit double word
   (GCC's mode(TI) attribute). */
typedef uint64_t word_t;
typedef unsigned int dword_t __attribute__((mode(TI)));
#define WORD_BITS 64

#endif
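
The double-width dword_t is what makes the single-word primitives easy to write; for instance, a full-width multiply returning the carry could look roughly like this (a sketch in the same spirit, not code taken from bsdnt):

/* Multiply two words; return the low word and store the high word
   (the carry into the next limb) through *hi. */
static word_t word_mul(word_t a, word_t b, word_t *hi)
{
    dword_t prod = (dword_t) a * b;
    *hi = (word_t) (prod >> WORD_BITS);
    return (word_t) prod;
}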

If it's of interest, the guy who initiated the project has written a blog on writing bignum libraries.

GMP/MPIR is massively more complicated: gmp-h.in becomes gmp.h post-configure, which defines this:

#define GMP_LIMB_BITS                      @GMP_LIMB_BITS@

In short, the limb width is set as part of the build process, which works it out via config.guess (i.e. autotools).

0

Using int_fast32_t from stdint.h would seem to be an option, although you are at the mercy of Those Who Decide as to what "fast" means.
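
A trivial way to see what your implementation picked (just a check, not from the question):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Prints 4 where int_fast32_t is a 32-bit type,
       8 where "fast" was taken to mean the 64-bit native word. */
    printf("sizeof(int_fast32_t) = %zu\n", sizeof(int_fast32_t));
    return 0;
}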

ergosys
  • 1
    If there is no penalty for 32-bit access, as on x86_64, `int_fast32_t` should be a 32-bit type, but performing bignum arithmetic in 64-bit units would certainly be preferable... – R.. GitHub STOP HELPING ICE Dec 29 '10 at 02:48
  • 1
    You may be right in general, but on my x86_64 machine (Ubuntu 10.10), it's 64-bit. I would expect it to correlate to the width of the machine's main data registers, but I have no hard evidence of that. – ergosys Dec 29 '10 at 03:43