
I want to write a memcpy that copies word by word instead of byte by byte to increase speed (though I still need byte-by-byte copies for the last few bytes). For that, I want my source and destination addresses to be properly aligned. I looked at the implementation of memcpy in glibc, https://fossies.org/dox/glibc-2.22/string_2memcpy_8c_source.html, and it aligns only the destination address. But if the source address is not properly aligned, that will still result in a bus error (assume alignment checking is enabled on my CPU). I'm not sure how to make both the source and the destination properly aligned: if I align the source by copying a few bytes one at a time, that also advances the destination address, so a destination that was properly aligned at first might no longer be aligned. Is there any way to align both? Please help me.

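void  memcpy(void  *dst,  void  *src, int  size)
{
   /* A cast is not an lvalue in standard C, so use typed pointers. */
   double *d8 = (double *)dst;
   double *s8 = (double *)src;

   while(size >= 8) /* code will give sigbus error if src = 0x10003 and dst = 0x100000 */
   {
     *d8++  =  *s8++;
     size  =  size  -  8;
   }

   /* Copy the remaining bytes one at a time. */
   char *d1 = (char *)d8;
   char *s1 = (char *)s8;

   while(size-- > 0)
   {
     *d1++  =  *s1++;
   }
}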
Praveen
  • I don't understand how you could align the source address AND dest? I'm sorry, I just don't get it? – Martin James Feb 05 '16 at 23:20
  • @MartinJames Consider that I'm passing src = 0x10003, dest = 0x100000 and a size of 15. If I want to do a word-by-word copy in memcpy, I want to align src to a multiple of 8 (64-bit OS). The next multiple of 8 is 0x10008, so for the first 5 bytes I'll do a byte-by-byte copy. If I do that, src will be 0x10008 but dest = 0x100005, so now dest is not a multiple of 8, and this problem will continue. So how can I align both src and dest here? If it's not possible, is there any other way to do a word-by-word copy? – Praveen Feb 05 '16 at 23:58
  • You can't. Assuming you are moving a block of words, if source is odd but target is even, you are in a sense aligning misaligned words. On the other hand, if source is even and target is odd, you are misaligning aligned words. You'll be either reading or writing misaligned words - ain't no way around that other than to copy a byte at a time. – 500 - Internal Server Error Feb 06 '16 at 00:16
  • @500-InternalServerError So suppose I pass an odd address as src. According to https://fossies.org/dox/glibc-2.22/string_2memcpy_8c_source.html, dest is checked for memory alignment but not the source. So will the code always give a bus error? – Praveen Feb 06 '16 at 00:37
  • Both the source and destination addresses must be misaligned by the same amount (which can be 0). The function needs a lead-in that determines whether the misalignment is the same and copies 0 or more misaligned bytes, a body that copies using the larger move instructions, and a trailer that copies any final bytes. At every point the code must check that there is more to copy. When the source and destination are not at the same misalignment, little if anything can be gained by such an algorithm. – user3629249 Feb 06 '16 at 18:58

2 Answers


...so at first the destination address which was aligned at first properly might not be aligned properly now. So is there any way to align both?

I found this article on memcpy optimization, which I believe discusses what you are trying to do at length...

modified-GNU algorithm:

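#include <stddef.h>
#include <stdint.h>

void * memcpy(void * dst, void const * src, size_t len)
{
    char * pcDst = (char *) dst;
    char const * pcSrc = (char const *) src;

    /* Word copy only when both pointers are already long-aligned
       (sizeof(long) is 4 on typical 32-bit targets). */
    if (!((uintptr_t) src & (sizeof(long) - 1)) && !((uintptr_t) dst & (sizeof(long) - 1)))
    {
        long * plDst = (long *) dst;
        long const * plSrc = (long const *) src;

        while (len >= sizeof(long))
        {
            *plDst++ = *plSrc++;
            len -= sizeof(long);
        }

        pcDst = (char *) plDst;
        pcSrc = (char const *) plSrc;
    }

    /* Copy the tail, or everything if the pointers were not aligned. */
    while (len--)
    {
        *pcDst++ = *pcSrc++;
    }

    return (dst);
}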
ryyker
  • Thank you for the answer, but here the code first checks whether both the destination and the source are 4-byte aligned. If they are aligned it does a long copy, else a byte-by-byte copy. But if the size is more than 1000, for speed I'd do a byte-by-byte copy until both my src and dest are aligned, then a word-by-word copy, and finally a byte-by-byte copy again for the remainder. For a word-by-word copy both the source and the destination need to be word aligned. I want a way to do that. – Praveen Feb 06 '16 at 00:35
  • @Praveen: And even if you adopt a greedy approach, the check should be for whether it's the same misalignment, in which case you can correct it by initial 1/2/3-byte copy. – einpoklum Jun 12 '16 at 12:07

With the glibc memcpy code you included, there is no way to call the function without the memory already being aligned. If you were to write your own, the way I see it, there are two possible alignment cases for the memcpy:

1) Both of the buffers are offset from a four-byte boundary by the same amount, or both are already on a four-byte boundary. (src % 4 == dst % 4) In this case, copying the first few bytes byte-by-byte then using the alignment of only the destination address is fine.
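
Case 1 can be sketched roughly as follows. This is only a minimal sketch, not glibc's code: the name copy_same_misalignment is made up for illustration, it assumes a 4-byte word, and it falls back to a plain byte loop when the two misalignments differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name, not a glibc function. */
void *copy_same_misalignment(void *dst, const void *src, size_t len)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    /* Word copies are only possible when src % 4 == dst % 4. */
    if ((uintptr_t)d % 4 == (uintptr_t)s % 4) {
        /* Head: copy 0 to 3 bytes until both pointers are 4-byte aligned. */
        while (len > 0 && (uintptr_t)d % 4 != 0) {
            *d++ = *s++;
            len--;
        }
        /* Body: aligned 32-bit loads and stores.
           Caveat: these casts ignore C's strict-aliasing rules, which a
           production implementation has to deal with separately. */
        if (len >= 4) {
            uint32_t *dw = (uint32_t *)d;
            const uint32_t *sw = (const uint32_t *)s;
            while (len >= 4) {
                *dw++ = *sw++;
                len -= 4;
            }
            d = (unsigned char *)dw;
            s = (const unsigned char *)sw;
        }
    }

    /* Tail: the last 0 to 3 bytes, or the whole buffer when the
       misalignments differ. */
    while (len > 0) {
        *d++ = *s++;
        len--;
    }
    return dst;
}
```

Because both pointers advance together, a single head loop that aligns the destination aligns the source as well, which is why checking only one of them is enough in this case.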

2) The buffers are not both on the same boundary. (src % 4 != dst % 4) In this case, in order to copy from one alignment to another, one word at a time, the processor would have to follow a process similar to the one below:

Load the new word
Split it into an upper half and lower half. 
Shift the upper half down
Shift the lower half up
Combine the upper half with the previous lower half.
Store the combined copy to memory
Repeat
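
The steps above might look something like this in C. It is a sketch under several assumptions: a little-endian CPU, a 4-byte word, a destination that is already 4-byte aligned, and a made-up helper name (copy_words_shift_merge). It also reads the whole aligned words that contain the first and last source bytes, so it touches up to 3 bytes on either side of the source range.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper. dst must be 4-byte aligned; src may be misaligned.
   Produces nwords 32-bit words. Assumes little-endian byte order. */
void copy_words_shift_merge(uint32_t *dst, const unsigned char *src, size_t nwords)
{
    unsigned off = (unsigned)((uintptr_t)src % 4);

    if (off == 0) {
        /* Source is aligned too: plain word copy, no shifting needed. */
        const uint32_t *ws = (const uint32_t *)src;
        while (nwords--)
            *dst++ = *ws++;
        return;
    }

    /* Round the source down to the aligned word that contains src[0]. */
    const uint32_t *ws = (const uint32_t *)(src - off);
    unsigned down = 8 * off;       /* shift for the part kept from prev */
    unsigned up = 32 - down;       /* shift for the part taken from next */

    uint32_t prev = *ws++;         /* first aligned load */
    while (nwords--) {
        uint32_t next = *ws++;     /* load the new word */
        /* Shift one part down, the other up, and combine them. */
        *dst++ = (prev >> down) | (next << up);
        prev = next;
    }
}
```

Every load and store here is aligned, at the cost of two shifts and an OR per word. Whether that beats a byte loop depends on the processor, so it is worth measuring rather than assuming.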

I'm not sure this would be any faster than just copying byte-by-byte. Halfword-by-halfword might be faster if your processor architecture allows it and both buffers are aligned on the halfword, although most memcpy implementations I've seen on architectures that support halfword load/store already do that.
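
One way to decide up front which unit is usable is to compare the low address bits of the two pointers. The helper below is a hypothetical illustration (common_copy_unit is not a library function): it returns the largest power-of-two unit, up to 8 bytes, at which both pointers can become aligned after the same number of leading byte copies.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 8, 4, 2 or 1. */
size_t common_copy_unit(const void *dst, const void *src)
{
    /* Bits that differ between the two addresses. If the low k bits are
       equal, the pointers share the same misalignment modulo 2^k. */
    uintptr_t diff = (uintptr_t)dst ^ (uintptr_t)src;

    if ((diff & 7) == 0) return 8;
    if ((diff & 3) == 0) return 4;
    if ((diff & 1) == 0) return 2;   /* halfword copies are possible */
    return 1;                        /* only byte copies stay aligned */
}
```

For the addresses in the question (src = 0x10003, dst = 0x100000) the lowest bit differs, so this returns 1: no amount of leading byte copies will align both pointers even to a halfword.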