Put a 4 Byte Integer into the first 4 char elements of vector

Question

I have a vector<unsigned char> and want to put a 4-byte integer into the first 4 elements. Is there a simpler way in C++ than masking like this:

myVector.at(0) = myInt & 0x000000FF;
myVector.at(1) = myInt & 0x0000FF00;
myVector.at(2) = myInt & 0x00FF0000;
myVector.at(3) = myInt & 0xFF000000;

Get a pointer to the first element and `reinterpret_cast` it? But I'm not 100% convinced that isn't UB... (Have you thought about byte order?) What are you actually trying to achieve here? This feels like a really bad idea. — BoBTFish, Jun 04 '13 at 08:19
@BoBTFish, In your case the behavior at least will depend on endianness. — Lol4t0, Jun 04 '13 at 08:23
The vector contains data that is sent via USB, and the first 4 bytes are to contain the length of the payload data. Byte order is taken care of. — tzippy, Jun 04 '13 at 08:23
@BoBTFish The reason why `reinterpret_cast` is generally incorrect is that the alignment of the `unsigned char*` buffer inside the vector may not be strict enough. For an `unsigned char * charbuf`, doing `reinterpret_cast<int*>(charbuf)` and then dereferencing the result is only allowed if `charbuf` was aligned for this. — jogojapan, Jun 04 '13 at 08:39
If an exact replica is needed, why not use memcpy()? If the value must be split this way, I'd consider working with unsigned int or doing something about the sign, especially with solutions that use >>. — Balog Pal, Jun 04 '13 at 10:43
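The alignment concern raised in the comments above can be sidestepped by copying bytes rather than casting the buffer pointer. The following is a minimal sketch of the reading direction; the helper name `read_u32_host` is hypothetical, and it assumes the first four bytes hold the value in host byte order.

```cpp
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reads a 32-bit value stored in host byte order
// from the first four bytes of buf, without casting the buffer pointer.
std::uint32_t read_u32_host(const std::vector<unsigned char>& buf)
{
    std::uint32_t value = 0;
    if (buf.size() < sizeof(value))
        throw std::out_of_range("buffer shorter than 4 bytes");
    // memcpy places no alignment requirement on the source pointer.
    std::memcpy(&value, buf.data(), sizeof(value));
    return value;
}
```

Using an unsigned fixed-width type here also follows the suggestion to avoid sign issues.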

score 12 · Answer 1 · edited Jun 04 '13 at 11:57

A std::vector is guaranteed to be stored as one contiguous block of data‡. Hence it is possible to treat a std::vector<unsigned char> in basically the same way as an unsigned char buffer. This means you can memcpy your data into the vector, provided you are sure it is large enough:

#include <vector>
#include <cstring>
#include <cstdint>

int main()
{
  std::int32_t k = 1294323;
  std::vector<unsigned char> charvec;

  if (charvec.size() < sizeof(k))
    charvec.resize(sizeof(k));

  std::memcpy(charvec.data(), &k, sizeof(k));

  return 0;
}

Note: The data() function of std::vector returns a pointer to the vector's internal buffer (here, an unsigned char*). It is available since C++11; in earlier versions you can use the address of the first element, &charvec[0], instead, provided the vector is not empty.

Of course this is a very unusual way of using a std::vector, and (due to the necessary resize()) slightly dangerous. I trust you have good reasons for wanting to do it.

‡ Or, as the Standard puts it:

(§23.3.6.1/1) [...] The elements of a vector are stored contiguously, meaning that if v is a vector<T, Allocator> where T is some type other than bool, then it obeys the identity
       &v[n] == &v[0] + n for all 0 <= n < v.size().

Seeing as C++ (without Boost) still lacks a real buffer class comparable with Go's https://golang.org/pkg/bytes/#Buffer or Java's https://docs.oracle.com/javase/7/docs/api/java/nio/ByteBuffer.html I feel like stuff like this is sadly necessary quite often. However what I dislike about the memcpy approach is that it will store the data in host byte order instead of in a well defined byte order as is very useful when using this for network/file IO. — Niklas Schnelle, Sep 29 '16 at 08:29
@NiklasSchnelle sorry to solicit an opinion here, but I see no problem with the memcpy approach. 99.9% of the time you compile for a specific host, or at least a host with the same endianness. CPUs with a different endianness will have completely different instruction sets. As for network/file I/O, well, it's alright having some pre-defined standard like they do in Java, but Java is platform agnostic. And having such an endian enforcement pleases the processors it matches with and hammers the crap out of the others because of all the byte swapping. So there's no real win here. — The Welder, Jul 02 '19 at 01:22
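For readers who do want a fixed byte order regardless of the host, as discussed in the comments above, one possible sketch follows. The helper names `put_u32_be` and `get_u32_be` are hypothetical, and big-endian (network) order is chosen only as an example; the code assumes 8-bit bytes.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: stores value in big-endian (network) order in the
// first four elements of buf, growing the vector if it is too short.
void put_u32_be(std::vector<unsigned char>& buf, std::uint32_t value)
{
    if (buf.size() < 4)
        buf.resize(4);
    buf[0] = static_cast<unsigned char>((value >> 24) & 0xFF);
    buf[1] = static_cast<unsigned char>((value >> 16) & 0xFF);
    buf[2] = static_cast<unsigned char>((value >> 8) & 0xFF);
    buf[3] = static_cast<unsigned char>(value & 0xFF);
}

// Hypothetical counterpart: rebuilds the value from big-endian bytes.
// Assumes buf.size() >= 4.
std::uint32_t get_u32_be(const std::vector<unsigned char>& buf)
{
    return (static_cast<std::uint32_t>(buf[0]) << 24) |
           (static_cast<std::uint32_t>(buf[1]) << 16) |
           (static_cast<std::uint32_t>(buf[2]) << 8) |
            static_cast<std::uint32_t>(buf[3]);
}
```

Because the shifts operate on the value rather than on its memory representation, the resulting byte sequence is the same on little-endian and big-endian hosts.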

score 4 · Answer 2 · answered Jun 04 '13 at 08:23

You have to bit-shift your values for this to work:

myVector.at(0) = (myInt & 0xFF);
myVector.at(1) = (myInt >> 8) & 0xFF;
myVector.at(2) = (myInt >> 16) & 0xFF;
myVector.at(3) = (myInt >> 24) & 0xFF;

Your code is wrong, because each masked value is truncated to its low 8 bits when it is assigned to an unsigned char:

int myInt = 0x12345678;
myVector.at(0) = myInt & 0x000000FF; // puts 0x78 
myVector.at(1) = myInt & 0x0000FF00; // tries to put 0x5600 but puts 0x00
myVector.at(2) = myInt & 0x00FF0000; // tries to put 0x340000 but puts 0x00
myVector.at(3) = myInt & 0xFF000000; // tries to put 0x12000000 but puts 0x00
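The difference between the two variants can be checked with a small side-by-side sketch. The function names `mask_only` and `shift_then_mask` are hypothetical, `byte_index` is assumed to be in the range 0 to 3, and the code assumes 8-bit bytes.

```cpp
#include <cstdint>

// Mask without shifting: the selected byte stays in its original position,
// so converting to unsigned char (low 8 bits only) discards it.
unsigned char mask_only(std::uint32_t value, unsigned byte_index)
{
    return static_cast<unsigned char>(value & (0xFFu << (8 * byte_index)));
}

// Shift first, then mask: the selected byte is moved into the low 8 bits
// before the conversion, so it survives.
unsigned char shift_then_mask(std::uint32_t value, unsigned byte_index)
{
    return static_cast<unsigned char>((value >> (8 * byte_index)) & 0xFFu);
}
```

With the value 0x12345678, `mask_only(value, 1)` yields 0x00 while `shift_then_mask(value, 1)` yields 0x56. Taking the value as an unsigned type also avoids questions about right-shifting negative numbers.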

Serdalis · Answer 3 · answered Jun 04 '13 at 08:34

You can do something similar to the following:

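#include <vector>
#include <cstddef>
#include <cstdio>

// Appends the bytes of integer to container in host byte order.
void
insert_int(std::vector<unsigned char>* container, int integer)
{
    unsigned char* bytes = reinterpret_cast<unsigned char*>(&integer);
    container->insert(container->end(), bytes, bytes + sizeof(int));
}

int main(void)
{
    std::vector<unsigned char> test_vector;
    int test_int = 0x01020304;

    insert_int(&test_vector, test_int);

    // Print each stored byte to reveal the host byte order.
    for (std::size_t i = 0; i < test_vector.size(); ++i)
        std::printf("%d ", test_vector[i]);
    std::printf("\n");

    return 0;
}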

Just remember to account for endianness. My (little-endian) machine prints the bytes of the int in reverse order: 4, 3, 2, 1.
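If the code needs to know which order the host uses, a runtime probe is one option. This is only a sketch: the function name `host_is_little_endian` is hypothetical, and it distinguishes only little-endian from everything else.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical probe: returns true if the least significant byte of a
// multi-byte integer is stored at the lowest address on this host.
bool host_is_little_endian()
{
    const std::uint32_t probe = 1;
    unsigned char first = 0;
    // Copy only the byte at the lowest address of probe.
    std::memcpy(&first, &probe, 1);
    return first == 1;
}
```

On a host where this returns true, copying 0x01020304 byte by byte produces 4, 3, 2, 1, matching the output described above.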

score 1 · Answer 4 · answered Jun 04 '13 at 08:28

Your solution is incorrect as you currently have it. Something like:

std::vector<unsigned char> v(sizeof(int));
int myInt = 0x12345678;

for(unsigned i = 0; i < sizeof(int); ++i) {
    v[i] = myInt & 0xFF;
    myInt >>= 8;
}

This should work. It's also more portable, since it doesn't assume that int is 4 bytes.
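The same loop can be generalized to other integer widths. The sketch below uses a hypothetical template named `to_bytes_le`, restricts itself to unsigned types to avoid sign issues when shifting, and assumes 8-bit bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical generic helper: splits any unsigned integer into bytes,
// least significant byte first, independent of the host byte order.
template <typename UInt>
std::vector<unsigned char> to_bytes_le(UInt value)
{
    static_assert(std::is_unsigned<UInt>::value, "use an unsigned type");
    std::vector<unsigned char> out(sizeof(UInt));
    for (std::size_t i = 0; i < sizeof(UInt); ++i) {
        out[i] = static_cast<unsigned char>(value & 0xFF);
        value = static_cast<UInt>(value >> 8);
    }
    return out;
}
```

For example, `to_bytes_le<std::uint32_t>(0x12345678u)` yields the bytes 0x78, 0x56, 0x34, 0x12 in that order.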

score 1 · Answer 5 · answered Jun 04 '13 at 08:54


Here is the most compact way:

myVector.at(0) = *((char*)&myInt+0);
myVector.at(1) = *((char*)&myInt+1);
myVector.at(2) = *((char*)&myInt+2);
myVector.at(3) = *((char*)&myInt+3);


fatihk


I like that style too but beware of type punning on embedded systems as this can cause undefined behavior due to byte alignment issues. – Nick Weedon Apr 16 '15 at 07:14
