
I often see (in the Linux kernel, for example) unsigned long used to hold pointers. I wonder what the reason for this is, given that a pointer may be larger than any standard integer type, including long.

Is it portable to keep a pointer in unsigned long rather than in uintptr_t in Linux user space applications? (I know that uintptr_t guarantees that a void * can be converted to it and back without loss of information.)

Thanks.

Mark
  • Kernel code and user space code are not the same environment. Using 'unsigned long' for pointers in kernel code may be a questionable practice with modern compilers, but much of the code was written long ago, and the writers of that code have a pretty good clue what they are doing. The answer to your question is NO, it's not portable for kernel or user mode code. If you inspect the kernel code, you will find many instances of conditionally compiled source that deal with compiler-specific and target-architecture-specific issues, which you should not have to resort to in most user space code. – jwdonahue Oct 09 '20 at 02:05
  • See: http://archive.opengroup.org/public/tech/aspen/lp64_wp.htm Look at the table in the TECHNICAL CHOICES section. For all memory models _except_ `LLP64` (which is only used by Microsoft compilers [AFAIK]), a `long` is always the same size as a pointer. The only reason MS chose `LLP64` at all was that, in DOS days, an `int` was 16 bits. To get a 32-bit number, they used `long` [`LONG`]. When they went to 64 bit, they used `LLP64` to preserve backward compatibility. But if the H/W arch supports 64 bit (e.g. Intel), there's no need for `LLP64`. So, the kernel is just fine – Craig Estey Oct 09 '20 at 02:32
  • Also, the kernel doesn't use (e.g.) `uint32_t`, but, rather `u32` [and others]. They were writing code long before `stdint.h` existed. And, personally, I prefer `u32` because it's shorter. Anyway, the kernel has ~21M lines of code. They have their own conventions. – Craig Estey Oct 09 '20 at 02:37
  • @CraigEstey, thanks for the feedback. Is the LP64 memory model also applicable to gcc and Linux? – Mark Oct 09 '20 at 03:04
  • If you read the standards, I think you will find that there is no requirement that long always be large enough to hold a pointer. It just happens that it is for Intel (IMS) and GNU compilers (probably others), the latter being the only one that really matters for Linux kernel compilations IIRC. As has been pointed out, the kernel has been evolving for decades and through many iterations of the C standards, but the one constant along the way has been the GNU compiler, which probably made Linux possible to begin with. – jwdonahue Oct 09 '20 at 03:21
  • @jwdonahue: in any case kernel code is never portable (not even to a different compiler on the same machine). `long` has many advantages over a dedicated type: e.g. clearer integer promotion and consistency with various POSIX APIs; it is also simpler in a "free-standing" (as defined in the C Standard) environment. Kernel and assembler code do many more operations on pointers (and the value has its own meaning). It would be a nightmare otherwise: this may be long, or maybe another type, so let's implement all cases. In any case, types in the kernel are not necessarily the types of the API (nor should they follow CPU data) – Giacomo Catenazzi Oct 09 '20 at 06:59
  • Can you add an example from the linux kernel where you see this? – stark Oct 09 '20 at 13:15
  • @stark novice here browsing the kernel code so I might be totally off, but I think an example is here in mm_struct (https://elixir.bootlin.com/linux/latest/source/include/linux/mm_types.h#L478). I believe these are pointers to vm_area_structs in mmap. (please correct me if I'm mistaken) – Kvass Dec 13 '20 at 14:22
  • Those are integers holding the start and end addresses of the code and data sections. They are integers so they work with conventional integer arithmetic instead of being dependent on the pointer type. – stark Dec 13 '20 at 14:51

1 Answer


Is it portable to keep a pointer in unsigned long rather than in uintptr_t in Linux user space applications? (I know that uintptr_t guarantees that a void * can be converted to it and back without loss of information.)

"Yes" in the sense that it will work on any current port of Linux and quite likely on future ones. But why would you? There is a perfectly good typedef that also states the intent: uintptr_t. Using it makes your code portable to Win64 as well.
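
As a rough illustration of the round trip that uintptr_t is meant for, here is a minimal sketch. The helper names ptr_to_int and int_to_ptr are hypothetical, chosen only for this example, and the code assumes an implementation that provides uintptr_t (the type is optional in the C standard, though glibc and MSVC both define it).

```c
#include <stdint.h>

/* Hypothetical helper: store a pointer in an integer type that is
 * wide enough to hold it on this implementation. */
static uintptr_t ptr_to_int(void *p)
{
    return (uintptr_t)p;
}

/* Hypothetical helper: recover the pointer from the stored integer.
 * The result compares equal to the pointer originally passed in. */
static void *int_to_ptr(uintptr_t v)
{
    return (void *)v;
}
```

Passing the address of an object through ptr_to_int and then int_to_ptr should give back a pointer that compares equal to the original. Swapping uintptr_t for unsigned long in these helpers would behave the same on Linux, but would truncate the address on Win64.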

uintptr_t was a C99 invention and Linux predates C99 by years. Back then there was no convention for naming an integer type wide enough to hold a pointer. There was no unsigned long long either, except as a compiler extension, so unsigned long was the only type you could reasonably expect to hold a pointer, if any type did, and so it was used. By now the dependency runs the other way: any new architecture that runs Linux needs to choose a width for long that is large enough for a pointer, size_t, and so on.

When 64-bit Windows arrived, too much existing code relied on unsigned long being 32 bits wide, so it stayed at 32 bits instead of growing to the width required to hold a pointer. To my knowledge, of all relevant platforms today, Win64 is the only one where unsigned long is not wide enough to hold a pointer.
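
If existing code has to keep pointers in unsigned long, one way to make the assumption explicit is to check it against the limits the implementation reports. The following is only a sketch: the function name ulong_holds_pointer is hypothetical, and it again assumes that uintptr_t and UINTPTR_MAX are available.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if every uintptr_t value fits in
 * unsigned long on this implementation, and 0 otherwise. */
static int ulong_holds_pointer(void)
{
    return UINTPTR_MAX <= ULONG_MAX;
}
```

On LP64 targets such as 64-bit Linux this is expected to return 1, and on LLP64 targets such as Win64 it is expected to return 0. The same comparison could be placed in a C11 _Static_assert so that a build for an unsuitable target fails at compile time rather than at run time.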