0

I am trying to implement my own memmove function. I am trying to optimize it instead of copying byte by byte. Here is my implementation:

void* my_memmove(void *dest, const void *src, size_t len) {
    
    if((uintptr_t)dest < (uintptr_t)src)
        return memcpy(dest,src,len);
    
    long *pdest = (long*)dest +len;
    const long *psrc = (const long*)src+len;
    
    if(!((uintptr_t)dest+len & (sizeof(long)-1)) &&
      !((uintptr_t)src+len & (sizeof(long)-1))) {
        
        while(len >= sizeof(long)) {
            *--pdest = *--psrc;
            len-=sizeof(long);
        }
    }
    
    char *pdest2 = (char*)pdest;
    const char *psrc2= (const char*)psrc;
    
    while(len) {
        *--pdest2 = *--psrc2;
        len--;
    }
    
    return dest;
}

Any input on how I can make this better or if the existing code has some issues?

Support Ukraine
  • 42,271
  • 4
  • 38
  • 63
  • 5
    Odds are `memmove` already does this and handles alignment and such. If you're looking for a review of working code you want https://codereview.stackexchange.com/ – Retired Ninja Jul 19 '21 at 06:00
  • You'd need to make sure that the long pointers have correctly aligned addresses, i. e. if need be copy first and last few bytes that do not match such addresses separately still byte by byte. – Aconcagua Jul 19 '21 at 06:06
  • @RetiredNinja Thank you! That is just what I am looking for, 'll move it to the other forum. – user5114537 Jul 19 '21 at 06:06
  • @Aconcagua Thank you for you reply. To handle that case, I would have to copy byte to byte first for the unaligned bytes and then default to word copy? – user5114537 Jul 19 '21 at 06:08
  • @user5114537 Exactly. Note, too, that `long` is not always optimal data type, e. g. it is 4 bytes on both 32 bit linux and windows, 8 bytes on 64 bit linux (so far all three optimal) but only four bytes on 64-bit windows (`long long` better there). Usually pointers and CPU registers have same size, in that case `uintptr_t` is a good choice, unfortunately this does not apply for *all* architectures (e. g. separate address and data channel to memory with differing bit sizes). – Aconcagua Jul 19 '21 at 06:15
  • 1
    I’m voting to close this question because code review requests belong to the Code Review. But make sure to read [their guidelines](https://codereview.stackexchange.com/help/on-topic). – Tsyvarev Jul 19 '21 at 06:52
  • @RetiredNinja: The code is buggy (even if not yet observed in execution by OP), not working, so critiquing the defects may be a suitable question for Stack Overflow. – Eric Postpischil Jul 19 '21 at 10:10
  • @Aconcagua: Where do you see any comparison of pointers? There is a comparison of two `uintptr_t` that result from converting pointers, and that is fine—comparison of `uintptr_t` is fully defined, regardless of their provenance. – Eric Postpischil Jul 19 '21 at 10:11
  • Re “I am trying to optimize it instead of copying byte by byte”: What makes you think `memmove` copies byte by byte? The specification for it says it copies characters (meaning bytes), but it does not say it does it byte by byte. The macOS `memmove` is highly optimized, takes alignments and lengths into account, and uses SSE/AVX/AVX512 registers as suitable to copy many bytes at a time. – Eric Postpischil Jul 19 '21 at 10:22

1 Answers1

1

There is a bug in the code calling memcpy.

I changed your code like

void* my_memmove(void *dest, const void *src, size_t len) {
    
    if((uintptr_t)dest < (uintptr_t)src)
    {
        printf("memcpy\n");
        return memcpy(dest,src,len);
    }
    
    printf("my code\n");
    return NULL;
}

int main() 
{
   unsigned char a[100] = {0};
   my_memmove(&a[0], &a[25], 50);  // Overlapping dst and src
   return 0;
}

Output:

memcpy

So you call memcpy with overlapping areas.

Another bug...

What do you expect from this line:

long *pdest = (long*)dest +len;

Assume dest is 0x1000 and len is 1. It seems you expect the value 0x1001 but try this:

int main() 
{
    size_t len = 1;
    long *pdest = (long*)0x1000 +len;
    printf("%p\n", (void*)pdest);
    
   return 0;
}

Output on my system:

0x1008

which is not what you want.

Support Ukraine
  • 42,271
  • 4
  • 38
  • 63
  • Good point – still further issues left, see comments to question... – Aconcagua Jul 19 '21 at 06:28
  • @Aconcagua Several problems... Violation of strict aliasing. Possible alignment violation. Wrong address calculation. – Support Ukraine Jul 19 '21 at 06:35
  • @4386427 Thank you for your inputs, I get that I need to take care of the few mentioned issues. I will make the suggested changes and post this on the code review page. Much appreciate your help! – user5114537 Jul 19 '21 at 06:55