
I have a scenario in which I need to copy the contents of a raw dynamically allocated uint8_t array into a vector (which is guaranteed to be empty whenever this scenario happens).

vector<uint8_t> myVector;
const uint8_t* myRawArray;

It is really important to me that the copy operation is as efficient as possible and portable (various compiler versions might be used).

One approach I thought of using is this:

myVector.reserve(byteCount);
myVector.insert(myVector.begin(), myRawArray, myRawArray + byteCount);

Any ideas on how the speed of that compares to this one:

myVector.resize(byteCount);
memcpy(myVector.data(), myRawArray, byteCount);

I guess memcpy should be fast, but then I am forced to use resize, which needs to zero out the memory first, so I guess that will slow it down a bit.

Also, any other suggestions?

mk33
  • What about measuring? – πάντα ῥεῖ Jan 01 '16 at 00:54
  • I've seen a few subtly different versions of this question. There isn't much performance in it either way, and the fastest solution varies with compiler and hardware. I'd probably just use the insert version, as it's the most idiomatic C++. If you have specific hardware and a specific compiler to target, then go ahead and measure on that. – RichardBruce Jan 01 '16 at 00:59
  • Another option is: `myVector.reserve(byteCount); std::copy(myRawArray, myRawArray + byteCount, std::back_inserter(myVector));` – Remy Lebeau Jan 01 '16 at 00:59
  • Your ideas are sane. Measure them to find out which is fastest for your data on your target platform. (But I would expect close to _nil_ difference, generally.) – Lightness Races in Orbit Jan 01 '16 at 00:59
  • Another option, if you can declare the vector at the time of the copy: `std::vector<uint8_t> myVector(myRawArray, myRawArray + byteCount);` Otherwise, `swap()` a temp vector: `std::vector<uint8_t> tmp(myRawArray, myRawArray + byteCount); myVector.swap(tmp);` – Remy Lebeau Jan 01 '16 at 01:01
  • Thanks, I am aware that I can measure them, at least on platforms that I have available right now. But I was wondering if there is any approach that in general works fastest. – mk33 Jan 01 '16 at 01:05
  • The `insert` approach avoids zeroing of the vector buffer and so with an ideal implementation it would be fastest (least required work). But this depends very much on the implementation. So you need to measure anyway, but if you can't, then go with the `insert` as a default. Unless you can simply initialize the vector with the data. – Cheers and hth. - Alf Jan 01 '16 at 01:44

3 Answers


If you don't need to create the vector before the copy takes place, you could always pass the raw array to the constructor of your vector:

std::vector<uint8_t> myVector(myRawArray, myRawArray + byteCount);

If you do need to construct the vector beforehand, the following is an option:

std::vector<uint8_t> myVector;
// ... do some stuff ...
// Now, we're ready for the copy, and byteCount is known.
myVector.reserve(byteCount);
std::copy(myRawArray, myRawArray + byteCount, std::back_inserter(myVector));

I would suggest using std::copy unless memcpy is proven to be faster. std::copy is safer and more idiomatic in C++ code, but don't be afraid to use memcpy if it really is proven to be faster. The speed difference will most likely change with different compilers.
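Since "proven to be faster" means measuring, here is a minimal timing sketch for the two approaches from the question. The helper names (`copyWithInsert`, `copyWithMemcpy`, `averageMicros`) are made up for illustration, and this is only a rough harness under those assumptions, not a rigorous benchmark; results will depend on compiler, flags, and hardware.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: fills a fresh vector via reserve + insert.
std::vector<std::uint8_t> copyWithInsert(const std::uint8_t* src, std::size_t n) {
    std::vector<std::uint8_t> v;
    v.reserve(n);
    v.insert(v.end(), src, src + n);
    return v;
}

// Hypothetical helper: fills a fresh vector via resize + memcpy.
std::vector<std::uint8_t> copyWithMemcpy(const std::uint8_t* src, std::size_t n) {
    std::vector<std::uint8_t> v;
    v.resize(n);  // value-initializes (zeroes) n bytes before the copy
    if (n != 0) {
        std::memcpy(v.data(), src, n);
    }
    return v;
}

// Runs fn(src, n) `rounds` times and returns the average time per call
// in microseconds.
template <typename Fn>
double averageMicros(Fn fn, const std::uint8_t* src, std::size_t n, int rounds) {
    assert(rounds > 0);
    using Clock = std::chrono::steady_clock;
    volatile std::size_t sink = 0;  // consume results so the calls are not trivially dropped
    const Clock::time_point start = Clock::now();
    for (int i = 0; i < rounds; ++i) {
        sink = sink + fn(src, n).size();
    }
    const Clock::time_point stop = Clock::now();
    const std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / rounds;
}
```

Calling, for example, `averageMicros(copyWithInsert, myRawArray, byteCount, 1000)` and the same with `copyWithMemcpy`, in an optimized build on each target platform, gives a first impression of whether the difference matters at all.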

I hope this helps.

Ryan McCleary
  • @M.M: I wasn't aware that std::vector's range constructor zero-initializes the vector's elements. What is the rationale behind this? – Ryan McCleary Jan 01 '16 at 05:56

memcpy() is usually written in assembly and heavily optimized, so you can expect it to be fast. For trivially copyable types such as uint8_t, vector::insert is usually implemented with a call to memcpy (or memmove) under the hood, but it also has to check whether there is enough space in the vector for the insertion to take place without a reallocation. I have not profiled this, but I would bet that the first version, with the call to reserve, is faster.

An alternative is std::copy, which has been found to be slightly faster than memcpy in some cases. Where possible, you can expect it to call memcpy (or memmove) itself or to do something better, so performance should not be a problem. Note that std::copy on its own does not resize the destination; combined with std::back_inserter, it takes care of growing the vector to the size you need.
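One detail worth spelling out: std::copy itself never changes the size of its destination, so the vector either has to be grown through std::back_inserter or sized up front. A small sketch of both forms follows; the function names are hypothetical and only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

// Appends n bytes to dst. back_inserter calls push_back for each element,
// so the vector grows by itself; reserve avoids repeated reallocations.
void appendWithBackInserter(std::vector<std::uint8_t>& dst,
                            const std::uint8_t* src, std::size_t n) {
    assert(src != nullptr || n == 0);
    dst.reserve(dst.size() + n);
    std::copy(src, src + n, std::back_inserter(dst));
}

// Overwrites dst with n bytes. The destination must already hold n elements
// before std::copy writes through dst.begin(), hence the resize.
void copyIntoResized(std::vector<std::uint8_t>& dst,
                     const std::uint8_t* src, std::size_t n) {
    assert(src != nullptr || n == 0);
    dst.resize(n);
    std::copy(src, src + n, dst.begin());
}
```

Writing through `dst.begin()` of an empty vector without the `resize` would be undefined behaviour, which is the usual pitfall with this approach.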

Curious

Thanks, all, for your input on my issue. I have resolved the problem by making the following changes to my structure:

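   #include <cstring>  // memcpy

   // BYTE is assumed to be the Windows typedef for unsigned char.
   struct YUV_Buffer
   {
       static const int kBufSize = 8 * 1024 * 1024;  // fixed buffer size in bytes

       // BYTE* rather than void*: delete[] on a void* is undefined behaviour.
       BYTE *pCacheBuf = nullptr;
       int frameID = 0;
       int height = 0;
       int width = 0;

       // Allocates the cache and copies sizBuf bytes from pBuf into it.
       void CopyBuf(const BYTE * pBuf, int sizBuf)
       {
           pCacheBuf = new BYTE[sizBuf];
           memcpy(pCacheBuf, pBuf, sizBuf);
       }

       YUV_Buffer(const BYTE * pBuf, int nFrameID, int nHeight, int nWidth)
           : frameID(nFrameID), height(nHeight), width(nWidth)
       {
           CopyBuf(pBuf, kBufSize);
       }

       // Deep copy, so that each object owns its own buffer.
       YUV_Buffer(const YUV_Buffer & yuvbuf)
           : frameID(yuvbuf.frameID), height(yuvbuf.height), width(yuvbuf.width)
       {
           CopyBuf(yuvbuf.pCacheBuf, kBufSize);
       }

       // Copy assignment (rule of three). Both buffers are kBufSize bytes,
       // so the existing allocation is reused.
       YUV_Buffer & operator=(const YUV_Buffer & yuvbuf)
       {
           if (this != &yuvbuf) {
               memcpy(pCacheBuf, yuvbuf.pCacheBuf, kBufSize);
               frameID = yuvbuf.frameID;
               height = yuvbuf.height;
               width = yuvbuf.width;
           }
           return *this;
       }

       ~YUV_Buffer()
       {
           delete[] pCacheBuf;
           pCacheBuf = nullptr;
       }
   };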

I then use it like this:

  YUV_Buffer nBuffer = YUV_Buffer((BYTE*)pSysFrame, pmfxInSurface->Data.FrameOrder, pmfxInSurface->Info.CropH, pmfxInSurface->Info.CropW);
mBuffer.emplace_back(nBuffer);
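For comparison with the original question, the same idea can be sketched with a std::vector<uint8_t> member instead of a manually managed buffer, so that copying and destruction come for free. This is only an illustration under assumptions: `FrameBuffer` and `storeFrame` are hypothetical names, and the buffer size is passed in rather than fixed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vector-backed variant of the frame buffer above.
// The compiler-generated copy, assignment, and destructor are all correct
// because std::vector manages the memory.
struct FrameBuffer {
    std::vector<std::uint8_t> data;
    int frameID = 0;
    int height = 0;
    int width = 0;

    // The range constructor of std::vector copies the raw bytes directly.
    FrameBuffer(const std::uint8_t* src, std::size_t size,
                int nFrameID, int nHeight, int nWidth)
        : data(src, src + size),
          frameID(nFrameID), height(nHeight), width(nWidth) {}
};

// Hypothetical helper: constructs the frame in place inside the container,
// which avoids building a temporary and copying it afterwards.
FrameBuffer& storeFrame(std::vector<FrameBuffer>& frames,
                        const std::uint8_t* src, std::size_t size,
                        int frameID, int height, int width) {
    assert(src != nullptr || size == 0);
    frames.emplace_back(src, size, frameID, height, width);
    return frames.back();
}
```

Passing the constructor arguments straight to `emplace_back` is what lets the element be built in place; passing an already constructed object makes `emplace_back` copy it.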

I hope this helps others. Compliments also to sarabande from expert-exchange for their help and input.

Regards, Nigel

nchannon