67

I'm arguing with my boss about this. They say "Yes they can be different."

Is it possible that sizeof(T*) != sizeof(const T*) for a type T?

edmz
P45 Imminent
  • 3
    I feel like the title *Could it be the case* and the sentence in your question *Is it possible* are a bit misleading, or perhaps mismatched with the language-lawyer tag. Do you want to know whether it could ever happen regardless of what the standard says, or whether such an implementation would be conformant? Several of the answers read like the former, and I can see why they may have interpreted the question that way. – Shafik Yaghmour Nov 23 '15 at 20:20

4 Answers

82

No, they can't be different. For sufficiently different T1 and T2, sizeof(T1 *) can be different from sizeof(T2 *), but if T2 is just const T1, then:

3.9.2 Compound types [basic.compound]

3 [...] Pointers to cv-qualified and cv-unqualified versions (3.9.3) of layout-compatible types shall have the same value representation and alignment requirements (3.11). [...]

And any type T is layout-compatible with itself:

3.9 Types [basic.types]

11 If two types T1 and T2 are the same type, then T1 and T2 are layout-compatible types. [...]


Value representation is defined in relation to the object representation: you can't have the same value representation without also having the same object representation. The latter means the same number of bits is required.

3.9 Types [basic.types]

4 The object representation of an object of type T is the sequence of N unsigned char objects taken up by the object of type T, where N equals sizeof(T). The value representation of an object is the set of bits that hold the value of type T. For trivially copyable types, the value representation is a set of bits in the object representation that determines a value, which is one discrete element of an implementation-defined set of values.[44]

44) The intent is that the memory model of C++ is compatible with that of ISO/IEC 9899 Programming Language C.

The requirement is worded in terms of value representation, rather than just saying that the two types have the same object representation, because T * and const T * not only have the same number of bits: the same bits in T * and const T * make up the value. This guarantees not only that sizeof(T *) == sizeof(const T *), but also that you can use memcpy to copy a T * pointer value to a const T * pointer value or vice versa and get a meaningful result, the exact same result you would get with const_cast.
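As a rough illustration of the memcpy point, here is a minimal sketch. `Widget`, `copy_pointer_bytes` and `check_byte_copy` are hypothetical names invented for this example, not anything from the standard, and the static assertions only restate what a conforming implementation is expected to provide.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical example type, used only for illustration.
struct Widget {
    int value;
};

// Copies the bytes of a T* object into a const T* object with memcpy,
// instead of converting the pointer with const_cast.
template <typename T>
const T* copy_pointer_bytes(T* p) {
    // Expected to hold on a conforming implementation; a compiler such as
    // the embedded ones described in the comments would fail here.
    static_assert(sizeof(T*) == sizeof(const T*),
                  "T* and const T* differ in size");
    static_assert(alignof(T*) == alignof(const T*),
                  "T* and const T* differ in alignment");

    const T* q = nullptr;
    std::memcpy(&q, &p, sizeof p);  // byte-for-byte copy of the pointer object
    return q;
}

// Runtime check: the byte copy agrees with const_cast and still
// refers to the same object.
inline void check_byte_copy(Widget& w) {
    const Widget* q = copy_pointer_bytes(&w);
    assert(q == const_cast<const Widget*>(&w));
    assert(q->value == w.value);
}
```

Calling `check_byte_copy` on any `Widget` should pass silently on a conforming compiler, since the copied bytes denote the same pointer value that `const_cast` would produce.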

The alignment requirements provide some additional guarantees too, but they're complicated to explain properly and not directly relevant to this question, and there are issues in the standard that undermine some of the intended guarantees, so I think that's best left ignored here.

  • 33
    Please note that this is only the standard. I have worked with an actual compiler for embedded target where `T*` and `const T*` were different in size (RAM pointer and ROM pointer). – ElderBug Nov 23 '15 at 11:32
  • 3
    @ElderBug While that's definitely interesting info, did that compiler claim conformance to any standard? –  Nov 23 '15 at 11:33
  • 4
    @hvd I don't think it did, but differentiating `T*` and `const T*` is the right choice for some small embedded targets. The standard is really flawed on this point, because it assumes RAM and ROM can be accessed with the same pointers. The target I remember had different instructions for accessing RAM (short 16-bit pointers) and ROM (long 32-bit pointers), so trying to cast between them was rejected by the compiler. – ElderBug Nov 23 '15 at 11:50
  • @Elderbug are you thinking of 8-bit AVR? If so, which compiler was it? – Andy Brown Nov 23 '15 at 11:51
  • @AndyBrown I think it was an 8-bit PIC. I'm not sure. The RAM was in 256-byte banks. I think the compiler was not the Microchip one. – ElderBug Nov 23 '15 at 11:53
  • 8
    In the PIC case then what you have is a `rom const T*`, not just a `const T*`, so this rule would not apply. – Zebra North Nov 23 '15 at 12:06
  • @ElderBug There are definitely valid reasons for some implementations to have differently-sized pointers to the same objects, sure. I think the standard-compatible approach for it is to use (using your example) 32-bit pointers for both `T*` and `const T*`, and use an implementation-specific type (say `__short T *`) for pointers known never to point to ROM. Or, alternatively, use 16-bit pointers for both `T *` and `const T *`, have an implementation-specific type (say `__long T *`) for pointers to different memory regions, and require an attribute for objects in those regions. –  Nov 23 '15 at 12:18
  • @hvd The implementation-specific qualifier/attribute definitely makes sense, like the `rom` keyword MrZebra mentioned. I remember that the compiler I used had some other quirks, like forbidding function pointers (non-standard), so maybe that's just it. I don't remember well. Whatever, obviously there will be some obscure compilers that don't follow the standard. – ElderBug Nov 23 '15 at 12:44
  • 17
    @ElderBug: Note that the standard requires that you can have a `const T*` to RAM. So if you need different instructions to access ROM and RAM, you can't assume that dereferencing `const T*` equals using the ROM-access instruction. This is fundamental to C++; millions of `int Foo::getX() const` methods would break otherwise. – MSalters Nov 23 '15 at 13:17
  • @ElderBug: I would call the language you were working in "C++-like", then, not C++. – BlueRaja - Danny Pflughoeft Nov 23 '15 at 17:49
  • 3
    @BlueRaja-DannyPflughoeft Well, that's a point of view. I don't really disagree, but still, many people say Visual C++ is a C++ compiler, even though it is known to be non-conformant. – ElderBug Nov 23 '15 at 17:57
  • I think that the formulation of the standard supports the view that `const` might be considered merely a "promise" not to change data (and `volatile` is the withdrawing of the otherwise implicit promise *not* to change data unexpectedly); as such any other properties of the types remain unaffected – Hagen von Eitzen Nov 23 '15 at 19:39
  • 1
    This references two other terms with very technical meanings: value representation and alignment requirement. Maybe you could add what they mean to your answer. Why, for example, could you not have an implementation that used an 8-bit T* and a 16-bit const T* whose value representation only had the first 8 bits participate, both with an 8-bit alignment requirement? – Steve Cox Nov 23 '15 at 20:17
  • In this situation, `int **p;` ... `(const int**)p` would break horribly even though the standard requires it to work – M.M Nov 23 '15 at 20:28
  • @SteveCox I've amended my answer, I won't deny it's a little bit iffy based on the literal text of the standard, but the intent is clear. –  Nov 23 '15 at 20:34
  • It's clearly intended that `T` and `const T` be aliasable (e.g. the strict aliasing rule explicitly mentions this case). ISO/IEC 9899 does say that they should be the same size. – M.M Nov 23 '15 at 20:42
  • @M.M Yes. It's also intended in C++ that `T *` and `const T *` be aliasable, as is not only hinted from what I quoted in my answer, but also because otherwise the implicit conversion from `T **` to `const T *const *` is completely and utterly useless. I recall that it used to be that the equivalent of [basic.lval]p10 didn't allow such aliasing, but looking closer, I see it's allowed now. –  Nov 23 '15 at 20:47
  • So, if I had hardware that had a "write protect" bit inside address registers that would make store operations trap, and my compiler would take care to set that bit in the representation of `const` pointers, that would be non-conforming? – Simon Richter Nov 23 '15 at 23:13
  • @SimonRichter Yes, I think so, but minor modifications could make it valid: it's valid if you only do it for objects defined as `const`, for which any modification attempt renders the behaviour undefined. –  Nov 24 '15 at 06:34
  • @ElderBug The source argument to `strcpy` is `const char*`, but it isn't only used to copy from ROM. How can that representation possibly work with function argument declarations? – Barmar Nov 24 '15 at 22:08
  • @hvd: Not that I want to give compiler writers excuses to weaken the language further, but does the Standard allow a `T const **` to be used to access a `T*`, or a `T**` to access a `T const *`, or does it merely say that a `T *const*` can access a `T*`, and a `T **` can be used to access a `T*const`? – supercat Sep 01 '16 at 22:13
8

Microchip released such a C compiler where the sizeof(T*) was 2 but sizeof(const T*) was 3.

The C compiler was not standards-compliant in a few ways, so this says nothing about this being valid (I suspect it isn't and other answers agree).

APerson
Joshua
5

Hmm, this is highly esoteric, but I'm thinking that theoretically there could be an architecture which has, say, 256 bytes of RAM at address 0 and, say, some kilobytes of ROM at higher addresses. And there could be a compiler that would create an 8-bit pointer for int *i, because 8 bits is enough to hold the address of any object in the highly limited RAM area, and any mutable object of course is implicitly known to be in the RAM area. Pointers of type const int *i would need 16 bits so that they could point to any location in the address space. The 8-bit pointer int *i is castable to a 16-bit pointer const int *i (but not vice versa), so the castability requirement of the C standard would be satisfied.

But if there's such an architecture in existence, I'd sure like to see it (but not write code for it) :)

PkP
  • 3
    @Bathsheba I instead strongly doubt because, yes, `sizeof(T*)` can be different from `sizeof(S*)` but `int` shall be able to hold all the values in [-32767, +32767], which need at least 16 bits. So `int` cannot be 8 bits long. – edmz Nov 23 '15 at 14:58
  • 1
    @black - this never mentions the size of an int itself, just the size of a pointer to int. – TLW Nov 23 '15 at 19:07
  • 4
    Firstly, this question is about C++, not about C. Secondly, C has that same requirement that I mentioned in my answer that `T *` and `const T *` have the same representation, and phrases it as: "Similarly, pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements." Your hypothetical implementation violates that requirement. –  Nov 23 '15 at 19:23
  • 2
    there are a lot of Harvard architectures where data and code space is different, and constants can be stored in read only code space, therefore possible different pointer sizes – phuclv Nov 24 '15 at 06:44
  • @hvd, Yeah, sorry for any annoyance... yeah.., I didn't write the answer to pick a fight with anyone... he asked whether it was *possible*.. ("Could it be the case that..."), and phrasing a question like that sort of activates the brain in a funny way... Anyway, I just think that it *might* be *possible*.. not so much whether it would be compliant or not, just that it might be possible and such a compiler could actually make sense in a way. And I think that I might have actually encountered such a compiler decades ago, for some 8051 or something. And yes, definitely C, not C++. – PkP Nov 24 '15 at 07:49
  • Don't worry, you didn't annoy at all. I apologise if I gave you that impression. –  Nov 24 '15 at 07:58
2

As far as standard C++ is concerned, they cannot differ. C and C++ are similar on this point.

But there are many architectures out there with compilers written for them that do not follow this rule. Indeed then we are not really talking about standard C++, and some folk would argue then that the language is not C++, but my reading of your question (prior to the addition of the language lawyer tag) is more about the possibility.

In which case, it is possible. You may well find that a pointer to ROM (therefore const) is a different size to a pointer to RAM (const or non const.)

If you know for sure that your code will only wind up on a standards-compliant compiler, then your assumption is correct. If not, then I'd think carefully about relying on their sizes being the same.
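If the code might end up on one of those compilers, one option is to state the assumption at compile time so the build stops instead of misbehaving. The following is only a sketch: the trait name `const_pointer_layout_matches` is made up for this example and is not a standard library facility.

```cpp
#include <type_traits>

// Hypothetical trait: true when T* and const T* agree in size and alignment,
// which is what a standards-compliant compiler is expected to give.
template <typename T>
struct const_pointer_layout_matches
    : std::integral_constant<bool,
                             sizeof(T*) == sizeof(const T*) &&
                             alignof(T*) == alignof(const T*)> {};

// Guard the types whose pointers the code copies or reinterprets.
static_assert(const_pointer_layout_matches<int>::value,
              "int* and const int* have different layouts on this compiler");
static_assert(const_pointer_layout_matches<char>::value,
              "char* and const char* have different layouts on this compiler");
```

On a compliant compiler these assertions cost nothing; on a compiler with separate RAM and ROM pointer sizes they would fail at build time, which is usually the preferable outcome.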

Bathsheba
  • 1
    *"Although implicit conversion from `char*` to `const char*` must be supported, the converse is not true."* That's true for all `T`, isn't it? – Baum mit Augen Nov 23 '15 at 11:48
  • 5
    Can you provide compilable code containing `sizeof(T*) != sizeof(const T*)` that supports your answer? – TobiMcNamobi Nov 23 '15 at 13:04
  • @TobiMcNamobi : There would be endless quantities of 16 bit x86 code written in an environment meeting this requirement (even if not explicitly exhibited). In Compact, Large, and Huge memory models, stack pointers are near (16 bit) and data pointers are far or huge (32 bit). Compile time constants, for instance constant strings, were stored to the data segment. As a `const char *` they could be stored in 16 bits, but can also be stored in any 32 bit `char` pointer. It is not guaranteed that the reverse cast is possible. – Eric Towers Nov 23 '15 at 14:17
  • 3
    "Unless it's explicitly spelt out," -- It is. I posted exactly where the standard says so well before you posted your answer. –  Nov 23 '15 at 18:32