On which platforms can memmove and memcpy have a significant performance difference?

Question

I understand that the difference between memmove and memcpy is that memmove handles overlapping memory. I have checked the implementation in libgcc and read this article, [memcpy performance], from the Intel website.

In libgcc, memmove is similar to memcpy: both just go through the buffer one byte at a time, so their performance should be almost the same even after optimization.
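To make the comparison concrete, here is a minimal sketch of what byte-by-byte versions of the two functions might look like. This is not libgcc's actual source; the names `naive_memcpy` and `naive_memmove` are hypothetical. The only structural difference is that the memmove variant has to pick a copy direction before it enters the loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration only, not libgcc's real code. */
void *naive_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Always copies forward, from the lowest address up. */
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}

void *naive_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if ((uintptr_t)d <= (uintptr_t)s || (uintptr_t)d >= (uintptr_t)s + n) {
        /* dst is below src, or the regions do not overlap: forward is safe. */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    } else {
        /* dst starts inside src: copy backward so that source bytes
           are read before they are overwritten. */
        for (size_t i = n; i > 0; i--)
            d[i - 1] = s[i - 1];
    }
    return dst;
}
```

In this sketch the extra cost of memmove is one address comparison per call, which is consistent with the expectation that the two perform almost the same on large copies.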

Someone has measured this and written the article "memcopy, memmove, and Speed over Safety". Although I don't think memmove can be faster than memcpy, there should be no big difference, at least on Intel platforms.

So on what platform, and how, can memcpy be significantly faster than memmove? If there is none, why provide two similar functions instead of just memmove, which leads to a lot of bugs?

Edit: I'm not asking about the difference between memmove and memcpy; I know memmove can handle overlap. The question is: is there really any platform where memcpy is faster than memmove?

If I remember correctly, there are some issues related to overlapping memory addresses — Felice Pollano, Oct 25 '13 at 09:24
There are lots of questions already on `memcpy` vs. `memmove` (see the "Related" bar on the right). Are you sure that your question isn't already covered by one of those? — Oliver Charlesworth, Oct 25 '13 at 09:25
Commenters, please read the *whole* question. It seems he knows what the difference in the definition is, but that it seems in practice to make no difference. The question is "on which platforms *does* it matter?" — BoBTFish, Oct 25 '13 at 09:31
@Oli Charlesworth, I'm not asking about the difference between memmove and memcpy; I know memmove can handle overlap. The question is: is there really any platform where memcpy is faster than memmove? — ZijingWu, Oct 25 '13 at 09:31
@BoBTFish *Reading* a question before marking it as a duplicate is not the SO way. You're getting in the way of Progress(tm). — jalf, Oct 25 '13 at 09:48
@Suma, it should not be for historical reasons, because you could just fix memcpy to make it safe instead of introducing memmove. — ZijingWu, Oct 25 '13 at 10:03
It's been done, C11 added memcpy_s. https://en.cppreference.com/w/c/string/byte/memcpy — Hans Passant, Aug 23 '18 at 01:44

score 3 · Accepted Answer · edited May 23 '17 at 11:45

There is at least one recent case where the constraint of non-overlapping memory is used to generate faster code:

In Visual Studio, memcpy can be compiled using intrinsics, while memmove cannot. This results in memcpy being much faster for small regions of a known size, because the function call and setup overhead are removed. The inline implementation using movsd/movsw/movsb is not suitable for arbitrarily overlapping blocks, as it starts copying at the lowest address and increments edi/esi during the copy.

See also Make compiler copy characters using movsd.

GCC also lists memcpy among its built-in functions; the implementation and motivation are likely similar to those of Visual Studio.
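As an illustration of the "small region of a known size" case, here is a minimal sketch using two hypothetical helpers, `load_u32` and `store_u32`. The copy size is a compile-time constant, so an optimizing compiler is free to replace each call with a plain load or store rather than a library call. Whether it actually does so depends on the compiler and the optimization flags, so treat this as an assumption to check in the generated assembly.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from a possibly unaligned buffer. */
uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    /* The size is a compile-time constant and &v cannot overlap p,
       so the compiler may expand this copy inline. */
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: write a 32-bit value to a possibly unaligned buffer. */
void store_u32(unsigned char *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

The values are stored in the host's native byte order, so the helpers round-trip on the same machine but are not a portable serialization format.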


answered Oct 25 '13 at 11:02

Suma


But why aren't movsd/movsw/movsb suitable for an overlapping copy? If all parameters are known, the compiler can also choose movsd or movsw – ZijingWu Oct 25 '13 at 12:30
ok I understand it. Most of the time only the block size is compile time constant. – ZijingWu Oct 25 '13 at 12:55
movsX instructions always move in one direction, starting at the lowest address and incrementing the edi/esi during the copy. The D/W/B only selects unit of copy (DWORD, WORD, BYTE). – Suma Oct 25 '13 at 14:03
movsb and friends can copy in the other direction. They are controlled by the direction flag. There's no reason memmove cannot be as fast as memcpy. Though not every implementation does so, that's for sure. – Yan Zhou Dec 10 '16 at 07:07
"There's no reason memmove cannot be as fast as memcpy" is not completely correct, as mentioned by "The standard is not about the Intel platform" comment. It might be possible to make it as fast on one platform, but there's no guarantee on all possible platforms. – Mike Kaganski Jan 13 '22 at 11:12

Ehsan · Answer 2 · 2016-12-10T03:20:10.647

-3

Good practice: In general, USE memmove only if you have to. USE it when there is a reasonable chance that the source and destination regions overlap.

Otherwise USE memcpy. memcpy can be more efficient, since it does not have to handle overlap.
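A minimal sketch of that rule of thumb, using two hypothetical helpers: `remove_at` shifts elements within a single array, so source and destination overlap and memmove is required; `snapshot` assumes the caller passes two distinct buffers, so memcpy is enough.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: remove arr[i] by shifting the tail down one slot.
   Source and destination overlap, so memmove is required.
   Returns the new length. */
size_t remove_at(int *arr, size_t len, size_t i)
{
    if (i >= len)
        return len;  /* index out of range: nothing to remove */
    memmove(&arr[i], &arr[i + 1], (len - i - 1) * sizeof arr[0]);
    return len - 1;
}

/* Hypothetical helper: copy into a separate buffer.
   Assumes dst and src do not overlap, so memcpy is allowed. */
void snapshot(int *dst, const int *src, size_t len)
{
    memcpy(dst, src, len * sizeof src[0]);
}
```

If the non-overlap assumption in `snapshot` cannot be guaranteed by the caller, memmove is the safe choice.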

Reference: https://www.youtube.com/watch?v=Yr1YnOVG-4g Dr. Jerry Cain, (Stanford Intro Systems Lecture - 7) Time: 36:00

edited Dec 10 '16 at 03:20

answered Dec 10 '16 at 02:38

