
In C, can one stuff a -1 value (e.g. 0xFFFFFFFF) into a pointer, using an approach such as this one, and expect that this memory address is never allocated at runtime?

The idea is to treat the pointer value as a memory address unless it holds this "special" -1 value. The pointer should still be treated as a memory address when it is NULL (in which case the object it points to has not yet been built).

I understand this may be platform dependent, but the program in question is expected to run on Linux, Windows and MacOSX.

The problem at hand is much larger than what is described here, so comments or answers which question this approach are not useful. I know it's a bit hacky, but the alternative is a major refactor :/

Thanks in advance.

faken
  • Actually I'm not sure if I get your question. Basically NULL is the address that is equal to the address of nothing else, so why not just use NULL? Not to mention it could have the value 0xFFFFFFFF on some platform. – Hayri Uğur Koltuk Jul 08 '14 at 16:49
  • Why don't you allocate a static sentinel variable, not to ever write anything to it, but to reserve an address? You'll be certain that no valid address coincides with that of this sentinel. – Pascal Cuoq Jul 08 '14 at 16:51
  • To clarify, 0XFFFFFFFF is only -1 if pointers are 32 bits. Otherwise, it's just a random pointer value that could totally be normally used. – Mooing Duck Jul 08 '14 at 16:51
  • In Linux, it is "safe" to assign -1 to a pointer. "Safe" in the sense that neither `malloc` nor `mmap` will ever return this address. – rslemos Jul 08 '14 at 16:51
  • It cannot be both -1 and NULL at the same time, however. Put another way: `0xffffffff != NULL`. – CodeClown42 Jul 08 '14 at 16:53
  • +1 Pascal Cuoq. One can also take the address of any function (say, for example, `main`) as sentinel. – rslemos Jul 08 '14 at 16:53
  • @HayriUğurKoltuk, I can't use NULL because of the real problem complexity, which I didn't describe in the question. Otherwise it would be the best approach. – faken Jul 08 '14 at 16:53
  • ok then what's the real problem? – Hayri Uğur Koltuk Jul 08 '14 at 16:54
  • @PascalCuoq, I previously thought of your approach, and it is certainly doable. However, this is a library, so I don't have a main() to do it nicely. And I would prefer to avoid this type of runtime initialization. Further, the library can be used in a threaded environment, so it would have to be an atomic initialization... it gets a bit ugly fast. – faken Jul 08 '14 at 16:56
  • @MooingDuck, 0xFFFFFFFF just is an example (e.g.) of what -1 would be converted to in terms of memory address. – faken Jul 08 '14 at 16:57
  • @rslemos, thanks, would this be the case in MacOSX and Windows? – faken Jul 08 '14 at 16:57
  • @faken: He said you can use "the address of any function (say, for example, main)". Emphasis on "any", not "main". I presume your library has a function. – Mooing Duck Jul 08 '14 at 17:03
  • @HayriUğurKoltuk, my real problem would be another entirely different question :D – faken Jul 08 '14 at 17:03
  • On OS X, the “address” `-1` will never be allocated at runtime. – Stephen Canon Jul 08 '14 at 17:04
  • @faken As far as I know, `&sentinel` is a “constant expression” (in the sense of C99 6.6) and can be used in an initializer. Why would any code be necessary? – Pascal Cuoq Jul 08 '14 at 17:05
  • @MooingDuck, I was answering his first comment. The function address may be a possible choice. – faken Jul 08 '14 at 17:05
  • @PascalCuoq, you're right. It's a possible solution for the problem. Please put your comment as a more complete answer below, after all you were the first to propose this solution. – faken Jul 08 '14 at 17:09
  • This is what sqlite does: http://www.sqlite.org/c3ref/c_static.html – michaelmeyer Jul 08 '14 at 17:29

2 Answers


It is GRAS (generally recognized as safe). No major OS will allocate memory that would collide with your chosen sentinel. However, there are a few pathological cases where it would be invalid to make this assumption. For instance, a pathological C++ compiler may choose to start the stack at 0xFFFFFFFF, without violating any constraints in the spec.

Within the scope of sane OSes, it is nearly impossible for 0xFFFFFFFF (or its 64-bit equivalent) to be a valid memory address. It cannot be the address of an array: C++ rules forbid it, since the pointer one past the end of the array must also be valid and compare greater. It could technically be the address of the last char of an object allocated at the very end of the address space, but two things prevent that:

  1. Most OSs have some padding
  2. Most OSs use high memory values as Kernel memory.
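
The array argument can be made concrete with plain unsigned arithmetic. A pointer one past the end of an object has to be computable and has to compare greater than a pointer into the object. The sketch below assumes a flat address space in which an address behaves like a `uintptr_t` (an assumption, not something the language guarantees), and the helper name `one_past_end_wraps` is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models an address as a plain unsigned integer
   (flat address space assumed). Returns 1 if an object of `size` bytes
   starting at `start` would have a one-past-the-end address that wraps
   around, so that it no longer compares greater than `start`. */
static int one_past_end_wraps(uintptr_t start, uintptr_t size)
{
    assert(size > 0);
    uintptr_t one_past = start + size;  /* unsigned arithmetic wraps modulo 2^N */
    return one_past <= start;
}
```

Under this model, `one_past_end_wraps(UINTPTR_MAX, 1)` reports a wrap, while an object whose last byte sits just below the top address does not. That is why the topmost address tends to stay unused.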

If you can use the address of a global variable as the sentinel instead, it is guaranteed to be safe, because no other object can share that address.

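#include <stdio.h>

static char sentinel;  /* never read or written; only its address matters */

int main(void)
{
    const char *p  = "Hello";    /* ordinary pointer */
    const char *p2 = NULL;       /* null pointer */
    const char *p3 = &sentinel;  /* sentinel pointer */

    if (p3 == &sentinel)
        puts("p3 was a sentinel");
    return 0;
}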
Cort Ammon
  • Thanks @CortAmmon, your answer seems correct, but Pascal answered it first in a comment, so I'll mark his answer as the correct one. Do have an upvote though. Thanks again. – faken Jul 08 '14 at 17:32
  • @CortAmmon What about low-integer addresses (1-4095)? I know Linux maps these as nonaccessible. I'm curious about other OSs. Indubitably, a constant address has some advantages over e.g., the address of a static. – Petr Skocik Jun 18 '19 at 17:12

One way to define a sentinel value that no other valid address will coincide with is a static variable:

static t sentinel;
t *p = &sentinel;

If you are going to assume a flat address space and that all pointers have the same width, you can minimize the overhead by declaring sentinel of type char instead of t.
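
A minimal sketch of the char-sized variant, under the same flat-address-space assumption. The names `struct widget`, `sentinel_storage`, `WIDGET_SENTINEL` and `widget_classify` are hypothetical stand-ins for `t` and the library's own code. The `alignas(max_align_t)` (C11) is an added precaution so that converting the char's address to `struct widget *` is not misaligned. Because the address is a constant expression, the pointer can be initialized statically, with no runtime or atomic setup.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical library type standing in for `t`. */
struct widget { int id; };

/* One reserved byte: never read or written, only its address is used. */
static alignas(max_align_t) char sentinel_storage;

#define WIDGET_SENTINEL ((struct widget *)&sentinel_storage)

/* Static initialization: the sentinel address is a constant expression. */
static struct widget *current = WIDGET_SENTINEL;

enum widget_state { WIDGET_NOT_BUILT, WIDGET_SPECIAL, WIDGET_READY };

/* Distinguishes the three cases from the question:
   NULL (not built yet), the sentinel, or a real object. */
static enum widget_state widget_classify(const struct widget *p)
{
    if (p == NULL)            return WIDGET_NOT_BUILT;
    if (p == WIDGET_SENTINEL) return WIDGET_SPECIAL;
    return WIDGET_READY;
}

/* Only real objects may be dereferenced. */
static int widget_id(const struct widget *p)
{
    assert(widget_classify(p) == WIDGET_READY);
    return p->id;
}

/* State of the statically initialized pointer. */
static enum widget_state widget_current_state(void)
{
    return widget_classify(current);
}
```

The sentinel pointer is only ever compared, never dereferenced, which is why a single byte of storage is enough.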


To answer your question about (t*)-1:

  • -1 has type int. I would recommend (t*)(uintptr_t)-1, which is more likely to be the last address even for a 64-bit flat address space.

  • it is not very clean, but it should work on all commonplace architectures. As long as the compiler compares pointers using the unsigned comparison assembly instruction (as it usually does), then for any object a that the compiler could hope to place at the end of the address space, &a + 1 has to compare greater than &a. In practice, this prevents the last address from being used to store anything.
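
The cast recommended above might be wrapped as follows. This is a sketch, not a portable guarantee: the result of converting an integer to a pointer is implementation-defined, so the special value should only ever be compared, never dereferenced. `struct node`, `NODE_SPECIAL` and the helper functions are hypothetical names standing in for `t` and the surrounding code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for `t`. */
struct node { int value; };

/* (uintptr_t)-1 is UINTPTR_MAX, i.e. all bits set at pointer width.
   Converting it to a pointer is implementation-defined behavior. */
#define NODE_SPECIAL ((struct node *)(uintptr_t)-1)

/* True only for the special marker value. */
static int node_is_special(const struct node *p)
{
    return p == NODE_SPECIAL;
}

/* True for a real object: neither NULL (not built yet) nor the marker. */
static int node_is_built(const struct node *p)
{
    return p != NULL && p != NODE_SPECIAL;
}

/* Dereference only after ruling out both non-object cases. */
static int node_value(const struct node *p)
{
    assert(node_is_built(p));
    return p->value;
}
```

Going through `uintptr_t` rather than casting the `int` -1 directly makes the intent explicit: the value is all ones at the full width of a pointer, on 32-bit and 64-bit targets alike.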

Pascal Cuoq
  • Thanks again Pascal. I use GLib's type conversion macros (which I think do what you suggest), so converting -1 to integer is taken care of for me. – faken Jul 08 '14 at 17:33