which of the two is faster: ?
1.
char* _pos ..;
short value = ..;
*((short*)_pos = va;
2.
char* _pos ..;
short value = ..;
memcpy(_pos, &value, sizeof(short));
which of the two is faster: ?
1.
char* _pos ..;
short value = ..;
*((short*)_pos = va;
2.
char* _pos ..;
short value = ..;
memcpy(_pos, &value, sizeof(short));
As with all "which is faster?" questions, you should benchmark it to see for yourself. And if it matters, then ask why and pick which you want.
In any case, your first example is technically undefined behavior since you are violating strict-aliasing. So if you had to choose without benchmarking, go with the second one.
To answer the actual question, which is faster will probably depend on the alignment of pos
. If it's aligned properly, then 1 will probably be faster. If not, then 2 might be faster depending on how it's optimized by the compiler. (1 might even crash if the hardware doesn't support misaligned access.)
But this is all guess-work. You really need to benchmark it to know for sure.
At the very least, you should look at the compiled assembly:
: *(short *)_pos = value;
mov WORD PTR [rcx], dx
vs.
: memcpy(_pos, &value, sizeof(short));
mov WORD PTR [rcx], dx
Which in this case (in MSVC) shows the exact same assembly with default optimizations. So you can expect the performance to be the same.
With gcc
at an optimization level of -O1
or higher, the following two functions compile to exactly the same machine code on x86:
void foo(char *_pos, short value)
{
memcpy(_pos, &value, sizeof(short));
}
void bar(char *_pos, short value)
{
*(short *)_pos = value;
}
The compiler might implement them both the same way.
If it does it naively, assignment will be faster.
For any practical purpose, they'll both be done in no time, and you don't need to worry.
Also note that you may have alignment problem s(_pos
may not be aligned on 2 bytes, which may crash on some processors), and type punning problems (the compiler may assume that what _pos
points to isn't changed, because you wrote using a short *
).
Does it matter? It might be that the first case will save you some cycles (depends on the compiler sophistication and optimizations). But is it worth the readibility and maintainability hit?
Many bugs are introduced because of premature optimization. You should first identify the bottleneck, and if this assignment is that bottleneck - benchmark each of the options (taking care of alignment and other issues mentioned here by others already).
The question is implementation-dependent. In practice, for doing nothing but copying sizeof(short) bytes, if one is going to be slower, it's going to be memcpy. For considerably larger data sets, if one is going to be faster, it's generally going to be memcpy.
As pointed out, #1 invokes undefined behavior.
We can see that simple assignment is certainly easier to read and write and less error prone than both. Clarity and correctness should come first, even in performance-critical areas for the simple reason that it's easier to optimize correct code than it is to fix optimized, incorrect code. If this is really a C++ question, the need for such code (casts or memcpy that bulldoze over the type system to x-ray and copy around bits) should be very, very rare.
If you are certain that there won't be an alignment issue, and you really find this is a bottleneck situation then go ahead and do the first.
If you are unhappy calling memcpy then do something like:
*pos = static_cast<char>(value & 0xff );
*(pos+1) = static_cast<char>(value >> 8 );
although if you are going to do that then use unsigned values.
The above code ensures you get little-endian too. (Obviously reverse the order of the assignments if you want big-endian). You might want a consistent endian-ness if the data is passed around as some kind of binary blob, which is, I assume, what you are trying to create.
You might wish to use something like google protocol buffers if you want to create binary blobs. There is also boost::serialize which includes binary serialization.
You can avoid breaking aliasing rules and calling a function by using a union:
union {
char* c;
short* s;
} _pos;
short value = ...
_pos->s = value;