8

I'm working on an embedded platform (ARM) and have to be careful when dealing with bit patterns. Let's pretend this line is beyond my influence:

uint8_t foo = 0xCE;          // 0b11001110

Interpreted as unsigned, this would be 206. But it is actually signed and thus represents -50. How can I continue using this value as signed?

int8_t bar = foo;            // doesn't work

Neither do these (both result in 0x10 or 0x00 for all input values):

int8_t bar = static_cast<int8_t>(foo);
int8_t bar = reinterpret_cast<int8_t&>(foo);

I just want the bits to remain untouched, i.e. bar should still hold the bit pattern 0xCE.

Conversely, I'd be interested in how to get bit patterns representing negative numbers into unsigned variables without altering the bit pattern. I'm using GCC.

jleahy
Sven-de
  • 1
    Converting from signed to unsigned will keep the bit pattern, so signed char -50 will become 206 unsigned. Going from unsigned to signed will be the same number if it can be represented as signed, otherwise it's implementation-defined. – JohnPS Sep 11 '11 at 03:13
  • 1
    My debugger fooled me, int8_t bar = foo; works just fine on my platform. Sorry about this, but still many thanks for the insights it gave me. – Sven-de Sep 12 '11 at 11:24
  • 1
    @Sven-de: yeah, it may be implementation-defined, but most implementations are going to take the easy way out (rather than saturating it at its maximum of +127) – Jason S Sep 12 '11 at 19:04
  • @Sven-de: If that's the useful-to-you answer, could you either put it as an answer and accept it, or accept Kerrek's very similar answer with a comment? It's confusing to have this question marked as not yet having an accepted answer when it is now answered. Thanks! – Brooks Moses Sep 15 '11 at 00:03

5 Answers

8

The following works fine for me, as it should, though as the comments say, this is implementation-defined:

int x = (signed char)(foo);

In C++, you can also say:

int x = static_cast<signed char>(foo);

Note that integral promotion preserves the value, not the bit pattern. Thus you first have to cast to the signed type of the same size as your unsigned type to force the signed reinterpretation.

(I usually face the opposite problem when trying to print chars as pairs of hex digits.)
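
A minimal sketch of the difference, assuming an 8-bit two's-complement target such as GCC on ARM. The helper names `widen_directly` and `widen_via_signed` are made up for illustration and are not part of the answer. Before C++20 the out-of-range conversion to `int8_t` is implementation-defined; from C++20 on it is defined to wrap modulo 2^8.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: integral promotion preserves the *value*,
// so the bit pattern 0xCE widens to 206.
inline int widen_directly(std::uint8_t raw) {
    return raw;
}

// Hypothetical helper: convert to the same-width signed type first.
// On two's-complement targets 0xCE becomes -50 (implementation-defined
// before C++20); the later widening to int keeps that value.
inline int widen_via_signed(std::uint8_t raw) {
    return static_cast<std::int8_t>(raw);
}

// Small self-check using the value from the question.
inline void check_promotion_example() {
    const std::uint8_t foo = 0xCE;
    assert(widen_directly(foo) == 206);
    assert(widen_via_signed(foo) == -50);
}
```

Calling `check_promotion_example()` from `main` should pass silently on such a target.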

Kerrek SB
  • This is, technically, UB. It results in overflow. – Oliver Charlesworth Sep 10 '11 at 19:00
  • Interesting - I thought this was well-defined... what about the other way, casting a signed char to an unsigned one? – Kerrek SB Sep 10 '11 at 19:34
  • 3
    unsigned to signed is implementation-defined if the number can not be represented as signed. signed to unsigned becomes "least unsigned integer congruent to the source integer (modulo 2^n)." In two's complement representation this just keeps the same bit pattern. See 4.7 Integral Conversions. – JohnPS Sep 11 '11 at 03:21
  • @JohnPS: I see, thanks. I edited the answer to say that it's implementation-defined. – Kerrek SB Sep 11 '11 at 11:50
6
uint8_t foo = 0xCE;          // 0b11001110
int8_t bar;
memcpy( &bar, &foo, 1 );

It even has the added bonus that practically every optimising compiler will completely optimise out the call to memcpy.
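
A self-contained sketch of the same idea in both directions, which also covers the second half of the question. The helper names `to_signed_bits` and `to_unsigned_bits` are invented here. The expected values rely on `int8_t` being two's complement without padding, which the fixed-width types guarantee wherever they exist.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy the object representation of a uint8_t
// into an int8_t without performing any value conversion.
inline std::int8_t to_signed_bits(std::uint8_t raw) {
    static_assert(sizeof(std::int8_t) == sizeof(std::uint8_t),
                  "both types must have the same size");
    std::int8_t out;
    std::memcpy(&out, &raw, sizeof out);
    return out;
}

// Hypothetical helper for the opposite direction.
inline std::uint8_t to_unsigned_bits(std::int8_t value) {
    std::uint8_t out;
    std::memcpy(&out, &value, sizeof out);
    return out;
}

// Self-check: every 8-bit pattern survives a round trip unchanged.
inline void check_round_trip() {
    for (int i = 0; i < 256; ++i) {
        const std::uint8_t raw = static_cast<std::uint8_t>(i);
        assert(to_unsigned_bits(to_signed_bits(raw)) == raw);
    }
}
```

If C++20 is available, `std::bit_cast<std::int8_t>(foo)` from `<bit>` expresses the same copy without spelling out the `memcpy`.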

Goz
3

Something ugly along the lines of this?

int8_t bar = (foo > 127) ? ((int)foo - 256) : foo;

This doesn't rely on a conversion whose behaviour is implementation-defined.
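
Wrapped in a function with the boundary cases spelled out, the idea might look like the sketch below. The name `to_signed_arith` is hypothetical; the arithmetic is done in `int`, so the value is already within the `int8_t` range before it is narrowed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: map 128..255 onto -128..-1 by subtracting 2^8.
// The subtraction happens in int, so nothing out of range is ever
// converted to int8_t.
inline std::int8_t to_signed_arith(std::uint8_t raw) {
    const int wide = raw;  // promotion keeps the value 0..255
    return static_cast<std::int8_t>(wide > 127 ? wide - 256 : wide);
}

// Self-check at the boundaries of the signed range.
inline void check_arith_boundaries() {
    assert(to_signed_arith(0x7F) == 127);
    assert(to_signed_arith(0x80) == -128);
    assert(to_signed_arith(0xFF) == -1);
}
```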

Oliver Charlesworth
  • I think you possibly meant 256, not 128, yes? And there's no need for the ugly C cast. – Cheers and hth. - Alf Sep 10 '11 at 18:57
  • 1
    @Alf: Indeed, you're probably correct that the usual arithmetic conversions mean that the cast (whether C or C++ style) is unnecessary. But these rules are difficult to remember, so I'd prefer to leave it explicit. – Oliver Charlesworth Sep 10 '11 at 19:04
0

With GCC, chances are that signed values are two's complement, even on your embedded platform.

Then the 8-bit number 0xCE represents 0xCE - 256 = -50.

Because two's complement is really just modulo 2^n, where n is the number of bits in the representation.

EDIT: hm, for rep's sake I'd better give a concrete example:

int8_t toInt8( uint8_t x )
{
    return (x >= 128? x - 256 : x);
}

EDIT 2: I didn't see the final question about how to get a bit pattern into an unsigned variable. That's extremely easy: just assign. The result is guaranteed by the C++ standard, namely that the value stored is congruent (on-the-clock-face equal) to the value assigned, modulo 2^n.
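
As a small illustration of that guarantee, here is a sketch with a hypothetical helper named `to_unsigned`. The bit patterns noted in the comments assume a two's-complement `int8_t`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: plain signed-to-unsigned conversion.
// The result is the source value reduced modulo 2^8.
inline std::uint8_t to_unsigned(std::int8_t value) {
    return static_cast<std::uint8_t>(value);
}

// Self-check with a few representative values.
inline void check_modulo_examples() {
    assert(to_unsigned(-50) == 0xCE);  // -50 + 256 == 206
    assert(to_unsigned(-1) == 0xFF);   // -1 + 256 == 255
    assert(to_unsigned(127) == 0x7F);  // non-negative values are unchanged
}
```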

Cheers & hth.,

Cheers and hth. - Alf
  • Did you mean "signed values are two's complement"? Unsigned values, of course, have no concept of sign & no other variance in bit representation. – underscore_d Jul 24 '16 at 09:20
0

You can access the representation of a value using a pointer. By reinterpreting the pointer type, not the value type, you should be able to replicate the representation.

uint8_t foo = 0xCE;          
int8_t bar = *reinterpret_cast<int8_t*>(&foo);
D Krueger
  • This violates strict aliasing & so can be optimised right out of the program or otherwise UBify it. The only standard way to 'reinterpret' representation requires trivially copyable types and `memcpy` to the destination. Hopefully, the compiler optimises that to the machine-code reinterpretation you always wanted, but which `reinterpret_cast` (perhaps sadly) is not guaranteed to provide. It's a very misunderstood keyword, probably because strict aliasing makes it nearly useless vs expectations. The exception is that `reinterpret_cast`ing to a `char` type to read representation is always valid. – underscore_d Jul 24 '16 at 09:13
  • @underscore_d I thought this was allowed as "a type that is the signed or unsigned type corresponding to the dynamic type of the object." May just be my misreading of the clause, however. – D Krueger Jul 25 '16 at 18:34
  • Yup, it seems you're right. Thanks for the reminder of that signed/unsigned line. However, `reinterpret_cast` isn't necessary here and probably provides lower likelihood (guarantee) of success than a simple `static_cast`: http://stackoverflow.com/a/1751368/2757035 ...or am I just failing at searching when trying to find a specification of how `reinterpret_cast` converts between signednesses? If not specified, I'd assume it does (A) anything it wants... or at least (B) what `static_cast` does, in which case, there's no need for the pointer round-trip. – underscore_d Jul 25 '16 at 18:39
  • @underscore_d A `static_cast` on the pointer will fail because `uint8_t` and `int8_t` are distinct types, so the `reinterpret_cast` is required here. N3797 says: "An object pointer can be explicitly converted to an object pointer of a different type." The use of the pointer, instead of the value, is because the original question indicated that the bit pattern must be the same in the unsigned and signed values. With two's-complement it doesn't matter, but using the pointer is required for other representations. – D Krueger Jul 25 '16 at 19:40
  • As indicated by the answer I linked, I was talking about a `static_cast` between the _values_, not pointers. And with other representations, AFAICT the Standard only says the mapping performed by `reinterpret_cast` is implementation-defined at best, which doesn't seem to strongly guarantee what the OP wants, just a _possible_ weak guarantee that might be given by their implementation... for now. – underscore_d Jul 25 '16 at 19:44
  • @underscore_d The "mapping" refers to the implementation-defined representation of the pointer. Otherwise, it's well-defined. – D Krueger Jul 26 '16 at 15:41