
I'm dealing with some code at work that includes an expression of the form

-(sizeof(struct foo))

i.e. the negation of a size_t, and I'm unclear on what the C and C++ standards require of compilers when they see this. Specifically, from looking around here and elsewhere, sizeof returns an unsigned integral value of type size_t. I can't find any clear reference for specified behavior when negating an unsigned integer. Is there any, and if so, what is it?

Edit: Ok, so there are some good answers regarding arithmetic on unsigned types, but it's not clear that this is in fact such an operation. When this expression is negated, is it operating on an unsigned integer, or converting to a signed type and doing something with that? Is the behavior to expect from the standards "imagine it's the negative number of similar magnitude and then apply the 'overflow' rules for unsigned values"?

Phil Miller
  • Do I want to know the reason why such a thing was brought into this world? – e.James Aug 12 '09 at 22:13
  • @eJames: Probably not, but there are certainly 'fair' reasons to do such a thing, such as using negative values to signify different interpretation of the magnitude. Before someone gets on me about premature optimization, this a) isn't code I wrote; and b) is part of the runtime for a parallel programming environment, one of whose applications accounts for ~20% of time on NSF supercomputers. In other words, every cycle counts. – Phil Miller Aug 12 '09 at 22:21
  • Is it doing a negation or is this the right side of a subtraction? The entire statement would probably be useful. – Chad Simpkins Aug 12 '09 at 22:23
  • It's a negation. I'll make that explicit. – Phil Miller Aug 12 '09 at 22:24
  • But just doing a negation does nothing in C - it will probably be optimised away. What is the negation actually used for? Surely you can post the complete statement that uses it? –  Aug 12 '09 at 22:33
  • @Neil: what happens with the result of this expression is not what I couldn't figure out. I know the conversion rules elsewhere. This kind of expression was an instance in which I didn't. – Phil Miller Aug 12 '09 at 22:48

6 Answers


Both ISO C and ISO C++ standards guarantee that unsigned arithmetic is modulo 2^n, i.e., for any overflow or underflow, it "wraps around". For ISO C++, this is 3.9.1 [basic.fundamental]/4:

Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer. [41]

...

41) This implies that unsigned arithmetic does not overflow because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting unsigned integer type.

For ISO C(99), it is 6.2.5/9:

A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.

Which means the result is guaranteed to be the same as SIZE_MAX - (sizeof(struct foo)) + 1.


In ISO 14882:2003 5.3.1.7:

[...] The negative of an unsigned quantity is computed by subtracting its value from 2^n, where n is the number of bits in the promoted operand. The type of the result is the type of the promoted operand.

John Millikin
Pavel Minaev
  • Add in ISO 14882:2003 5.3.1.7: "[...] The negative of an unsigned quantity is computed by subtracting its value from 2^n, where n is the number of bits in the promoted operand. The type of the result is the type of the promoted operand." – outis Aug 12 '09 at 22:38
  • The latter part is what's most relevant to the question I asked, though the former part shows how it's consistent with the approach taken elsewhere in the standard. – Phil Miller Aug 12 '09 at 23:04
  • This isn't *necessarily* 100% portably correct. See [my answer](http://stackoverflow.com/a/33621222/827263) for an unusual case where it might not be. – Keith Thompson Nov 10 '15 at 01:13

http://msdn.microsoft.com/en-us/library/wxxx8d2t%28VS.80%29.aspx

Unary negation of unsigned quantities is performed by subtracting the value of the operand from 2^n, where n is the number of bits in an object of the given unsigned type. (Microsoft C++ runs on processors that utilize two's-complement arithmetic. On other processors, the algorithm for negation can differ.)

In other words, the exact behavior will be architecture-specific. If I were you, I would avoid using such a weird construct.

John Millikin
  • Is that just the way Microsoft does it, or do they do it that way because it's implementation defined? – GManNickG Aug 12 '09 at 22:16
  • I did find this page when I was searching, and wished it said something about what part of the standard it relies on. It suggests this is not strictly defined, but is it "undefined", "implementation defined", unmentioned, or something else? Also, I think the text you quote is slightly erroneous. That should be subtracting from 2^n – Phil Miller Aug 12 '09 at 22:17
  • I don't have access to the C standard, but GCC also behaves this way (on x86). Sadly, I don't have access to a system which uses a non twos-complement representation for negative numbers to test on. If I remember correctly, integer representations are implementation-defined, so this would be also. – John Millikin Aug 12 '09 at 22:23
  • ISO 14882:2003 5.3.1.7 basically says the same thing, but doesn't include the caveat about non-two's complement machines. – outis Aug 12 '09 at 22:26
  • @outis: could you quote the text in question, possibly in a separate answer? That might be what I'm really getting at. – Phil Miller Aug 12 '09 at 22:29
  • @Novelocrat: done, but as a comment for Pavel's answer, as he's basically there already. – outis Aug 12 '09 at 22:39
  • @outis: Pavel's answer really wasn't what I needed. The standards text you found was. – Phil Miller Aug 12 '09 at 22:51
  • The exact behaviour is actually defined both by the C Standard and the C++ Standard. Microsoft's comment gives the same result, unless the integer is 0 (according to Microsoft, -0u would be 2^32 if unsigned int is 32 bit, which is nonsense). According to the C and C++ Standard, -x for unsigned x calculates the mathematical value, then repeatedly adds or subtracts UINT_MAX + 1 until the result is in the right range. – gnasher729 Sep 01 '14 at 11:03
  • The behavior of unsigned arithmetic does not depend on whether *signed* integers are represented using two's-complement or not. However, there are cases where signed arithmetic can be relevant (see [my answer](http://stackoverflow.com/a/33621222/827263)). – Keith Thompson Nov 10 '15 at 01:15

Negating an unsigned number is useful for propagating the LSB across the word to form a mask for subsequent bitwise operations.

Ste

The only thing I can think of is so wrong it makes my head hurt...

size_t size_of_stuff = sizeof(stuff);

if (want_to_subtract_the_size)       /* i.e. "if I want to subtract the size" */
    size_of_stuff = -sizeof(stuff);  /* unsigned negation: wraps modulo 2^n */

size_t total_size = size_of_stuff + other_sizes;

Overflow is a feature!

  • You might be imagining a bit too narrowly. Suppose the result of the expression is assigned to an `int` or `long` - some signed type. – Phil Miller Aug 12 '09 at 22:36

From the current C++ draft standard, section 5.3.1 paragraph 8:

The operand of the unary - operator shall have arithmetic or enumeration type and the result is the negation of its operand. Integral promotion is performed on integral or enumeration operands. The negative of an unsigned quantity is computed by subtracting its value from 2^n, where n is the number of bits in the promoted operand. The type of the result is the type of the promoted operand.

So the resulting expression is still unsigned and calculated as described.

User @outis mentioned this in a comment, but I'm going to put it in an answer since outis didn't. If outis comes back and answers, I'll accept that instead.

Phil Miller

size_t is an implementation-defined unsigned integer type.

Negating a size_t value probably gives you a result of type size_t with the usual unsigned modulo behavior. For example, assuming that size_t is 32 bits and sizeof(struct foo) == 4, then -sizeof(struct foo) == 4294967292, or 2^32 - 4.

Except for one thing: The unary - operator applies the integer promotions (C) or integral promotions (C++) (they're essentially the same thing) to its operand. If size_t is at least as wide as int, then this promotion does nothing, and the result is of type size_t. But if int is wider than size_t, so that INT_MAX >= SIZE_MAX, then the operand of - is "promoted" from size_t to int. In that unlikely case, -sizeof(struct foo) == -4.

If you assign that value back to a size_t object, then it will be converted back to size_t, yielding the SIZE_MAX-4 value that you'd expect. But without such a conversion, you can get some surprising results.

Now I've never heard of an implementation where size_t is narrower than int, so you're not likely to run into this. But here's a test case, using unsigned short as a stand-in for the hypothetical narrow size_t type, that illustrates the potential problem:

#include <iostream>
int main() {
    typedef unsigned short tiny_size_t;
    struct foo { char data[4]; };
    tiny_size_t sizeof_foo = sizeof (foo);
    std::cout << "sizeof (foo) = " << sizeof (foo) << "\n";
    std::cout << "-sizeof (foo) = " << -sizeof (foo) << "\n";
    std::cout << "sizeof_foo = " << sizeof_foo << "\n";
    std::cout << "-sizeof_foo = " << -sizeof_foo << "\n";
}

The output on my system (which has 16-bit short, 32-bit int, and 64-bit size_t) is:

sizeof (foo) = 4
-sizeof (foo) = 18446744073709551612
sizeof_foo = 4
-sizeof_foo = -4
Keith Thompson
  • There are some embedded controllers where RAM is subdivided into non-consecutive blocks of less than 256 bytes each, and that unless a program contains an object larger than that in ROM size_t could very reasonably be an 8-bit "unsigned char" (which is of course smaller than "int"). I think most such compilers in fact make size_t an unsigned int because of the expectation that it behave like one. It's too bad the Standard doesn't even describe normative behavior for such things, since it's silly to have programmers bend over backward to accommodate implementations that may not even exist. – supercat Nov 10 '15 at 20:00
  • @supercat: The standard *does* describe normative behavior for such things: unsigned types narrower than `int` are promoted to `int`. It's just not the behavior you'd prefer. (Nor, to be clear, is it the behavior I'd prefer.) – Keith Thompson Nov 10 '15 at 20:19
  • The Standard could very easily specify that implementations where `size_t` ranks below `int` should be considered non-normative; implementations which need to behave that way for some reason (e.g. compatibility with existing code) would be allowed to do so, but implementations would be strongly encouraged to make `size_t` be at least as large as `unsigned int` absent a compelling reason to do otherwise (e.g. compatibility with existing code). I can't think of any technical reasons why an implementation wouldn't be capable of making `size_t` be at least as large as `unsigned int`; can you? – supercat Nov 10 '15 at 20:32
  • Beyond the signed-vs-unsigned behavior issues, consider as well that even if a system can't allocate an object over some size (e.g. 255 bytes), it should still check the upper bits of the requested size to ensure they're zero, and return null if they aren't, rather than responding to a request for 2^64 bytes with an allocation for eight. – supercat Nov 10 '15 at 20:44