Convert int to little-endian formatted bytes in C++ for blobId in Azure

Question

I'm working with Base64 encoding for Azure (http://msdn.microsoft.com/en-us/library/dd135726.aspx) and I can't work out how to get the required string. I can do this in C# as follows:

int blockId = 5000;
var blockIdBytes = BitConverter.GetBytes(blockId);
Console.WriteLine(blockIdBytes);
string blockIdBase64 = Convert.ToBase64String(blockIdBytes);
Console.WriteLine(blockIdBase64);

Which prints out (in LINQPad):

Byte[] (4 items)
| 136         |
| 19          |
| 0           |
| 0           |

iBMAAA==

In Qt/C++ I tried a few approaches, all of them returning the wrong value.

const int a = 5000;
QByteArray b;

for(int i = 0; i != sizeof(a); ++i) {
  b.append((char)(a&(0xFF << i) >>i));
}

qDebug() << b.toBase64(); // "iIiIiA==" 
qDebug() << QByteArray::number(a).toBase64(); // "NTAwMA=="
qDebug() << QString::number(a).toUtf8().toBase64(); // "NTAwMA=="

How can I get the same result as the C# version?

Think what your loop does when `i` is 1. It shifts the 0xFF left one bit to get 0x1FE when you wanted 0xFF00. — David Schwartz, Apr 18 '12 at 05:12
Is there any easier way of getting the same result without the for loop? — chikuba, Apr 18 '12 at 05:14
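To make the failure in the question's loop concrete, here is a minimal sketch without Qt. The helper names `byteAsWritten` and `byteIntended` are hypothetical and exist only for illustration. Besides the shift amount mentioned in the comment above, `>>` binds more tightly than `&`, so the original expression masks `a` with `(0xFF << i) >> i`, which is `0xFF` for every `i` in the loop. Every byte therefore comes out as 0x88, which is what "iIiIiA==" decodes to.

```cpp
#include <cstdint>

// How the compiler parses the question's expression `a&(0xFF << i) >>i`:
// the shifts are evaluated first, and (0xFF << i) >> i is 0xFF for i = 0..3,
// so the result is always the low byte of a.
inline unsigned char byteAsWritten(int a, int i) {
    return static_cast<unsigned char>(a & ((0xFF << i) >> i));
}

// What was presumably intended (an assumption): move byte i down to the
// low 8 bits, then mask. The unsigned cast avoids shifting a signed value.
inline unsigned char byteIntended(int a, int i) {
    return static_cast<unsigned char>(
        (static_cast<std::uint32_t>(a) >> (8 * i)) & 0xFFu);
}
```

For `a = 5000` (0x1388), `byteAsWritten` returns 0x88 for every index, while `byteIntended` returns 136, 19, 0, 0 for indices 0 through 3, matching the C# output.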

David Schwartz · Answer 1 · 2012-04-18T09:02:09.890

4

See my comment for the problem with your for loop. It shifts by one more bit on each pass, but it should shift by 8 more bits. Personally, I prefer this to a loop:

    b.append(static_cast<char>(a >> 24)); 
    b.append(static_cast<char>((a >> 16) & 0xff)); 
    b.append(static_cast<char>((a >> 8) & 0xff)); 
    b.append(static_cast<char>(a & 0xff));

The code above is for network standard byte order (big endian). Flip the order of the four operations from last to first for little endian byte order.
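For reference, the flipped (little-endian) order might look like the sketch below. It uses `std::array` rather than `QByteArray` so that it compiles without Qt. The function name `toLittleEndian` and the use of `std::uint32_t` are assumptions made for illustration, not part of the original answer.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: returns the four bytes of v, least significant first.
inline std::array<unsigned char, 4> toLittleEndian(std::uint32_t v) {
    return {{
        static_cast<unsigned char>(v & 0xFF),          // bits 0..7
        static_cast<unsigned char>((v >> 8) & 0xFF),   // bits 8..15
        static_cast<unsigned char>((v >> 16) & 0xFF),  // bits 16..23
        static_cast<unsigned char>((v >> 24) & 0xFF)   // bits 24..31
    }};
}
```

With a `QByteArray b`, the same order would be produced by appending `static_cast<char>(...)` of each of the four expressions in turn. For 5000 the bytes are 136, 19, 0, 0, the same as the C# `BitConverter.GetBytes` output shown in the question.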

edited Apr 18 '12 at 09:02

answered Apr 18 '12 at 05:18

David Schwartz


Is there any reason why you didn't do `& 0xFF` for the first `append`? (I would do it on all of them, although on a machine with 8 bit bytes, it shouldn't make a difference.) – James Kanze Apr 18 '12 at 07:47
@JamesKanze It's not needed on the first, but it's harmless. I would expect the code generated to be exactly the same. – David Schwartz Apr 18 '12 at 08:59
It's not needed on any of them, if you're on a machine with 8 bit bytes. Since `a` has a signed type, it's needed on all of them if the machine has another size of bytes. (Also, of course: the conversion of a value in the range `[128...255]` is implementation defined if plain `char` is signed 8 bits. In practice, I've never seen an implementation where it wouldn't work.) – James Kanze Apr 18 '12 at 09:16
Is an integer conversion that overflows defined by C++? My recollection is that the results are undefined, but I get this wrong as often as right. – David Schwartz Apr 18 '12 at 09:27
your byteArray after calling toBase64() gives me "AAATiA==" which is not correct – chikuba Apr 18 '12 at 09:38
See the last two sentences of my answer. Without more context, there's no way to know which is wrong. My code uses standard byte ordering. C# apparently produces little-endian. (You should triple-check which you are supposed to be using!) – David Schwartz Apr 18 '12 at 09:56
@DavidSchwartz §4.7/3: "If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined." The C standard clarifies further, and states that the "implementation-defined value" can result in a signal being raised. – James Kanze Apr 18 '12 at 11:39
@DavidSchwartz Whether the conversion is implicit or explicit doesn't change anything. If you have a value of 140, say, and you convert it to a type with a range of -128...127, the behavior is "implementation-defined", and may signal. All I can say is in the implementations I know of where `char` is not an 8 bit 2's complement, and the conversion just grabs the relevant bits, plain `char` is unsigned, so the conversion becomes well defined. – James Kanze Apr 18 '12 at 13:44
Sorry, I mean the `& 0xff` is needed on all but the first. Otherwise, the result is implementation-defined. The whole point of this code (rather than casting integer to character) is to avoid implementation-defined behavior. – David Schwartz Apr 18 '12 at 13:52

chikuba · Accepted Answer · 2012-04-18T09:39:11.097

0

I ended up doing the following:

QByteArray temp;
int blockId = 5000;

for(int i = 0; i != sizeof(blockId); i++) {
  temp.append((char)(blockId >> (i * 8)));
}

qDebug() << temp.toBase64(); // "iBMAAA==" which is correct
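The whole pipeline can also be checked without Qt. The sketch below is an illustration only: `littleEndianBytes` and `base64` are hypothetical helpers, the encoder is a minimal standard-alphabet implementation written for this example, and real code would normally keep using `QByteArray::toBase64()`. It uses an unsigned value and an explicit mask so that the result does not rely on how a signed value converts to `char`.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Bytes of v, least significant byte first (little-endian).
inline std::vector<unsigned char> littleEndianBytes(std::uint32_t v) {
    std::vector<unsigned char> out;
    for (unsigned i = 0; i < sizeof(v); ++i) {
        out.push_back(static_cast<unsigned char>((v >> (8 * i)) & 0xFFu));
    }
    return out;
}

// Minimal Base64 encoder (standard alphabet, '=' padding).
inline std::string base64(const std::vector<unsigned char>& in) {
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    for (std::size_t i = 0; i < in.size(); i += 3) {
        // Pack up to three input bytes into one 24-bit group.
        std::uint32_t n = static_cast<std::uint32_t>(in[i]) << 16;
        if (i + 1 < in.size()) n |= static_cast<std::uint32_t>(in[i + 1]) << 8;
        if (i + 2 < in.size()) n |= in[i + 2];
        // Emit four 6-bit symbols, padding where input bytes were missing.
        out += tbl[(n >> 18) & 63];
        out += tbl[(n >> 12) & 63];
        out += (i + 1 < in.size()) ? tbl[(n >> 6) & 63] : '=';
        out += (i + 2 < in.size()) ? tbl[n & 63] : '=';
    }
    return out;
}
```

Calling `base64(littleEndianBytes(5000))` yields "iBMAAA==", the string the C# code prints. Encoding the same four bytes in big-endian order (0x00, 0x00, 0x13, 0x88) yields "AAATiA==", the value reported in the comments on the other answer.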

edited Apr 18 '12 at 09:39

answered Apr 18 '12 at 05:24

chikuba


Which gives different results from what David Schwartz did. Which one is correct? – James Kanze Apr 18 '12 at 07:49
This one gives the same result as the C# one; at least the Base64-encoded strings look the same. – chikuba Apr 18 '12 at 08:55
This will convert in little-endian format. My code converts in big-endian format. The [docs](http://msdn.microsoft.com/en-us/library/de8fssa4.aspx) say the original code uses little-endian format. – David Schwartz Apr 18 '12 at 09:01
Then it was little-endian I was after, since I needed to replicate the C# version, not just be able to convert to a byte array. Thanks anyway :) – chikuba Apr 18 '12 at 09:40

score -1 · Answer 3 · answered Apr 18 '12 at 05:39

I think this would be clearer, though may be claimed to be ill styled...

int i = 0x01020304;
char (&bytes)[4] = (char (&)[4])i;

and you can access each byte directly with bytes[0], bytes[1], ... and do whatever you want with them.
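A common way to get the same byte-level view without the reference cast is `std::memcpy` into a byte array, sketched below. The name `hostOrderBytes` is hypothetical. Like the cast, this exposes the bytes in the host's own order, so the result is little-endian only on a little-endian machine and the caller still has to handle byte order.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies the object representation of v into an array.
// The order of the bytes is whatever the host uses; it is NOT normalized.
inline std::array<unsigned char, sizeof(std::uint32_t)>
hostOrderBytes(std::uint32_t v) {
    std::array<unsigned char, sizeof(std::uint32_t)> bytes{};
    std::memcpy(bytes.data(), &v, sizeof(v));
    return bytes;
}
```

On a little-endian host, `hostOrderBytes(0x01020304)` starts with 0x04; on a big-endian host it starts with 0x01. When appending to a `QByteArray`, the bytes have to be added one at a time or with an explicit length, because the array is not a nul-terminated string.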

answered Apr 18 '12 at 05:39

BlueWanderer


I mean you can do "b.append(bytes[0])" things afterward. 0 to 3 if it is little-endian, 3 to 0 if it is big-endian. It's less error prone than the shift things and you can focus on the byte-order. – BlueWanderer Apr 18 '12 at 08:04
It's not portable; you have to adjust it for byte order, it supposes `int` is 32 bits, and it fails completely on the few machines which don't use 2's complement or have padding bits in their `int`. The shifting says exactly what is going on: the 8 bits in byte `n` are the bits `i...j` in the final int. – James Kanze Apr 18 '12 at 08:37
this one did not work at all. produced the base64 "BAMCAbDzMAEB" – chikuba Apr 18 '12 at 09:42
He's obviously messing up with each byte. And now it seems he doesn't actually understand what he's doing at all... How could 4 byte data be encoded to 66 bits... – BlueWanderer Apr 18 '12 at 14:05
He probably pushed the pointer, which interpreted it as a C-style, nul-terminated string, which it's not. – David Schwartz Apr 18 '12 at 14:22
