Algorithm to write two's complement integer in memory portably

Question

Say I have the following:

int32 a = ...; // value of variable irrelevant; can be negative
unsigned char *buf = malloc(4); /* assuming octet bytes, this is just big 
                          enough to hold an int32 */

Is there an efficient algorithm to write the two's complement big-endian representation of a to the 4-byte buffer buf in a portable way? That is, regardless of how the machine we're running on represents integers internally, how can I efficiently write the two's complement representation of a to the buffer?

This is a C question so you can rely on the C standard to determine if your answer meets the portability requirement.

Use int instead of int32. I believe that int is two's complement. Also, the endianness only matters when you talk to another computer and pass it the integer data. If that's an issue, then you have to do an endianness check on that machine to see if it's compatible. Do that by passing in the char `1` and having them send you the int value for `1`. Does this make sense? — Paul Nikonowicz, Aug 15 '13 at 04:08
@PaulNikonowicz: The C standard does not require that `int` be two's complement. The endianness does matter in this case because we're talking about writing an integer to a character buffer, and we don't want the number to be written "backwards". — Ricky Stewart, Aug 15 '13 at 04:17
@PaulNikonowicz, what does that mean? If you have a multi-byte type, you have to pick one endianness or another. — Carl Norum, Aug 15 '13 at 04:32
A char is one byte; therefore it doesn't suffer from endianness problems. — Paul Nikonowicz, Aug 15 '13 at 04:43
Here's a better explanation: http://stackoverflow.com/questions/2357720/network-byte-order-conversion-with-char — Paul Nikonowicz, Aug 15 '13 at 04:44
We're writing the representation of a 4-byte integer to a 4-byte array so endianness is obviously an issue. — Ricky Stewart, Aug 15 '13 at 04:47
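
To make the endianness point in these comments concrete, here is a minimal sketch contrasting two ways of filling the buffer. The helper names `store_host_order` and `store_big_endian` are hypothetical, and the sketch assumes octet bytes (CHAR_BIT == 8, so `sizeof(uint32_t) == 4`).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies the object representation as-is, so the
   byte order in out[] is whatever the host happens to use internally.
   Assumes sizeof(uint32_t) == 4, i.e. CHAR_BIT == 8. */
void store_host_order(uint32_t v, unsigned char out[4])
{
    memcpy(out, &v, sizeof v);
}

/* Hypothetical helper: works on the value rather than on the
   representation, so out[0] is always the most significant octet. */
void store_big_endian(uint32_t v, unsigned char out[4])
{
    out[0] = (v >> 24) & 0xFF;
    out[1] = (v >> 16) & 0xFF;
    out[2] = (v >> 8) & 0xFF;
    out[3] = v & 0xFF;
}
```

The first function gives different buffers on little-endian and big-endian hosts; the second gives the same buffer everywhere, which is why the answers below are built on shifts rather than on copying memory.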

Carl Norum · Accepted Answer · 2013-08-15T04:31:37.960

2

Yes, you can certainly do it portably:

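#include <limits.h>  // CHAR_BIT
#include <stdint.h>  // int32_t, uint32_t
#include <stdlib.h>  // malloc, size_t

int32_t a = ...;
uint32_t b = a;  // conversion to unsigned yields the two's complement value
unsigned char *buf = malloc(sizeof a);

uint32_t mask = (1U << CHAR_BIT) - 1;  // one-byte mask

for (size_t i = 0; i < sizeof a; i++)
{
    int shift = CHAR_BIT * (sizeof a - i - 1); // downshift amount to put next
                                               // byte in low bits
    buf[i] = (b >> shift) & mask;  // save current byte to buffer
}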

At least, I think that's right. I'll make a quick test.
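
For readers who want something they can compile and check, the loop above can be wrapped in a function along these lines. This is a sketch, not the answer's exact code: the name `put_int32_be` is hypothetical, and it swaps in `UCHAR_MAX` from `<limits.h>` as the one-byte mask instead of computing the mask with a shift.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the two's complement, big-endian form of a
   into buf, which must have room for sizeof(int32_t) bytes. */
void put_int32_be(int32_t a, unsigned char *buf)
{
    uint32_t b = (uint32_t)a;   /* conversion to unsigned is modulo 2^32 */

    for (size_t i = 0; i < sizeof a; i++) {
        /* downshift that moves byte i (counted from the top) into the low bits */
        unsigned shift = (unsigned)(CHAR_BIT * (sizeof a - i - 1));
        buf[i] = (unsigned char)((b >> shift) & UCHAR_MAX);
    }
}
```

With CHAR_BIT == 8, calling `put_int32_be(-2, buf)` should leave the bytes FF FF FF FE in `buf`, most significant first.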

edited Aug 15 '13 at 04:31

answered Aug 15 '13 at 04:20

Carl Norum


You're making it too hard. "Convert to twos complement" is simply the operation of conversion to an unsigned type that can cover the full range. – R.. GitHub STOP HELPING ICE Aug 15 '13 at 04:24
That's true; that would clean up a few lines. The "hard part" is the last bit in the loop. – Carl Norum Aug 15 '13 at 04:25
Edited; it works now in a couple of tests here. I'll break up the last bit to make it clearer what's going on. – Carl Norum Aug 15 '13 at 04:27
Upon more staring at this solution, I think it depends on `sizeof a` being greater than `1`. Otherwise the mask will break... you can work around that if it's a problem, I guess. – Carl Norum Aug 15 '13 at 04:33
@R.. That's very interesting if it is indeed portably true according to the C standard. I can't quite wrap my head around the reasoning yet, though. Thanks for your and Carl's help. Just wondering, is there an analogous strategy for performing the opposite operation? In other words, can we "parse" a two's complement integer from memory manually by reading it into an unsigned variable and converting it to a signed variable of the same size? – Ricky Stewart Aug 15 '13 at 06:36
@RickyStewart: Unless you assume the signed integer's format is two's complement, there's no guarantee the value will fit in it... If you're happy assuming that, just parse back to unsigned first, then convert it with this formula: `i = u<=0x7fffffff ? u : -1-(int)(-1-u);` (note: both branches of this `?:` collapse to the identity function on standard two's complement implementations, but the 'else' expression avoids overflow and implementation-defined conversions. Any decent compiler will generate the same code as `i=u;`) – R.. GitHub STOP HELPING ICE Aug 15 '13 at 08:10
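
The reverse operation described in the last comment could be sketched as follows. The name `get_int32_be` is hypothetical, the buffer is assumed to hold four octets with the most significant first, and the formula is restated with fixed-width types (`0xffffffffu - u` in place of `-1-u`) so that it does not depend on `int` being exactly 32 bits wide.

```c
#include <stdint.h>

/* Hypothetical helper: reads 4 octets, most significant first, and returns
   the int32_t whose two's complement encoding they form. */
int32_t get_int32_be(const unsigned char *buf)
{
    /* Reassemble the unsigned value from its bytes. */
    uint32_t u = ((uint32_t)buf[0] << 24)
               | ((uint32_t)buf[1] << 16)
               | ((uint32_t)buf[2] << 8)
               |  (uint32_t)buf[3];

    /* Values up to 0x7fffffff are non-negative and convert directly.
       Larger values are negative; 0xffffffffu - u is then at most
       0x7fffffff, so the cast and the subtraction cannot overflow. */
    return u <= 0x7fffffff ? (int32_t)u
                           : -1 - (int32_t)(0xffffffffu - u);
}
```

For example, the bytes FF FF FF FE decode to -2, and 80 00 00 00 decodes to the most negative 32-bit value.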

score 2 · Answer 2 · answered Aug 15 '13 at 04:21

2

unsigned long tmp = a; // Converts to "twos complement"
unsigned char *buf = malloc(4);
buf[0] = tmp>>24 & 255;
buf[1] = tmp>>16 & 255;
buf[2] = tmp>>8 & 255;
buf[3] = tmp & 255;

You can drop the & 255 parts if you're assuming CHAR_BIT == 8.
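
The "Converts to twos complement" comment rests on the C rule that converting a value to an unsigned type reduces it modulo one more than that type's maximum value, whatever the machine's signed representation is. A small sketch of that rule follows; the helper name `low32_of` is hypothetical. Because `unsigned long` may be wider than 32 bits, the sketch masks off the low 32 bits, which is the part the shifts in the answer actually use.

```c
#include <stdint.h>

/* Hypothetical helper: converts a to unsigned long, as the answer does,
   and returns the low 32 bits of the result. */
uint32_t low32_of(long a)
{
    /* For negative a this is a + (ULONG_MAX + 1): a modulo 2^N, where N
       is the width of unsigned long (at least 32). */
    unsigned long tmp = (unsigned long)a;

    /* 2^N is a multiple of 2^32, so the low 32 bits are a modulo 2^32,
       i.e. the 32-bit two's complement pattern of a. */
    return (uint32_t)(tmp & 0xFFFFFFFFul);
}
```

So -1 maps to 0xFFFFFFFF and -2 maps to 0xFFFFFFFE, with no dependence on how negative numbers are stored in memory.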

answered Aug 15 '13 at 04:21

R.. GitHub STOP HELPING ICE


Cast to `unsigned long` was a really good idea -- made it much simpler. (+1) – Jacob Pollack Aug 15 '13 at 04:24

Eric Z · Answer 3 · 2013-08-15T04:28:11.930

0

If I understand correctly, you want to store the 4 bytes of an int32 inside a char buffer in a specific order (e.g. lower byte first), regardless of how int32 is represented.

Let's first make the assumptions clear: CHAR_BIT == 8, two's complement, and sizeof(int32) == 4.

No, there is NO portable way in your code, because you are trying to convert it to char instead of unsigned char. Storing a byte in char is implementation-defined.

But if you store it in an unsigned char array, there are portable ways. You can right-shift the value by 8 bits each time and mask it with the bitwise AND operator & to form each byte of the resulting array:

// a is unsigned; buf is an unsigned char array; lower byte first
buf[0] = a & 0xFF;        // 1st byte
buf[1] = a >> 8 & 0xFF;   // 2nd byte
buf[2] = a >> 16 & 0xFF;  // 3rd byte
buf[3] = a >> 24 & 0xFF;  // 4th byte

edited Aug 15 '13 at 04:28

answered Aug 15 '13 at 04:17

Eric Z


Actually implementation-defined, not undefined. – R.. GitHub STOP HELPING ICE Aug 15 '13 at 04:19
if `char` is an unsigned type, it's fine. – Carl Norum Aug 15 '13 at 04:19
@R.. Correct. Updated. – Eric Z Aug 15 '13 at 04:19
Yes, I want to work off the *assumption* that sizeof(int32) == 4 and CHAR_BIT == 8. The `char` thing was a typo, I meant `unsigned char`. – Ricky Stewart Aug 15 '13 at 04:19
