I think I'm getting rusty, so please bear with me. I'll try to be brief.

Q1. When trying to copy buffers, say buf2 to buf1, does the following check suffice against aliasing?

if (buf2 >= buf1 && buf2 < buf1 + buf1size) {
    // aliasing
}

Q2. If so, can we selectively use either memcpy() or memmove() depending on the case, like this?

// use memcpy() by default
void *(*funcp)(void *restrict, const void *restrict, size_t) = &memcpy;

// switch to memmove() when aliasing
if ( aliasing ) {
    // this cast FORCEFULLY changes the type-qualifiers of the declared parameters
    funcp = (void *(*)(void *, const void *, size_t)) &memmove;
}

// later on ...
if ( buf2size <= buf1size ) {
    (*funcp)( buf1, buf2, buf2size ); // funcp() works too, I prefer making it explicit
}

It works, but I'm not at all comfortable with forcefully casting the type qualifiers of the parameters when switching to memmove(). I think the standard confirms my doubts (I can never find these darn things when I need them... I'm using C99, btw), but since the code works I'd like to be extra sure: if it's OK as written, it would save me from duplicating buf2, working with the duplicate, and freeing it when done.

Harry K.
  • I think using `memmove()` unconditionally should be more efficient than doing extra checks before it, because `memmove()` should already do the checking required to copy the contents efficiently. – MikeCAT May 21 '21 at 11:51
  • The comparisons are in general undefined behavior by 6.5.8.5. Arbitrary pointers can't be compared, only ones which are related by some larger structure (see standard for details). – Paul Hankin May 21 '21 at 11:56
  • The answer depends on the type of the variables involved. – Lundin May 21 '21 at 11:59
  • @MikeCAT, I think I'll just do that and get done with it. Thanks! – Harry K. May 21 '21 at 12:10
  • As an addendum to @PaulHankin's comment, the equality operators (`==`, `!=`) can be used for comparing arbitrary pointers, but the relational operators (`<`, `<=`, `>`, `>=`) cannot. – Ian Abbott May 21 '21 at 12:12
  • @PaulHankin thanks for the comment, but I just checked the C99 standard and I didn't see undefined-behavior. On the contrary, it says: `When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to`. Can you please elaborate? – Harry K. May 21 '21 at 12:13
  • @Lundin, how do you mean? – Harry K. May 21 '21 at 12:14
  • @IanAbbott (or @PaulHankin) could you please point to the text defining that, when you get the chance? I briefly went over Paul's suggestion (6.5.8) but can't find the UB related text: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf – Harry K. May 21 '21 at 12:19
  • @HarryK. Specifically the last sentence of 6.5.8p5: "In all other cases, the behavior is undefined." – Ian Abbott May 21 '21 at 12:34
  • @HarryK. Read the whole of paragraph 5: it lists a bunch of cases where pointer comparison is defined (arrays, pointers to struct members or unions) and has a catch-all UB for all other uses. – SergeyA May 21 '21 at 12:40
  • @IanAbbott, did some more googling and you guys are right (I still interpret the standard differently, but apparently it's only me). Thank you all for clearing it up for me! So, how do we detect mem aliasing then? – Harry K. May 21 '21 at 12:48
  • @HarryK. It depends on if they are arrays, pointers, pointers to arrays etc. – Lundin May 21 '21 at 12:49
  • @SergeyA, yes, thanks for the pointer (I should really brush up my English). Just replied to Ian too. – Harry K. May 21 '21 at 12:50
  • @Lundin, in this case they are both char buffers (can be static, dynamically allocated, buf2 may also be a literal). – Harry K. May 21 '21 at 12:50
  • You'll need to make a [mcve]. If a program can reason about the buffer sizes, it may be able to compare pointers to it. If not, then you aren't likely able to do so. Various other tricks such as converting to uintptr_t might be an option. – Lundin May 21 '21 at 12:53
  • @Lundin, not sure what you mean by "can reason the buffer sizes". I don't think an MRE is needed, tbh; for brevity just assume something like strcpy(char *, const char *), which serves the general purpose of my question. I'll use MikeCAT's suggestion anyway, but it still leaves me wondering: how can we detect mem aliasing in general? – Harry K. May 21 '21 at 13:01

2 Answers

I believe that the term "memory areas overlap" is used more frequently.

There is no portable way of doing this kind of pointer comparison. Standard library implementations have to compare the pointers, but in that case the author of the library knows exactly how the comparison works on the target platform.

The most popular implementation, glibc, uses unsigned long long or unsigned long integers to compare the pointers (or rather, to perform the address arithmetic).

Q2. If so, can we selectively use either memcpy() or memmove() depending on the case, like this?

It makes no sense, as memmove checks this itself. Most implementations I know of do not follow the C standard's description of moving memory areas, i.e. they do not create any temporary array; they only decide in which direction to copy the memory areas. If the memory areas do not overlap, the copy operation is as fast as when using memcpy.
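To illustrate why an unconditional memmove() is enough, here is a minimal sketch of an in-place shift where source and destination overlap. The helper name `shift_right` and its parameters are made up for this example; only memmove() itself comes from the standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: move the first `count` bytes of `buf` to the right
   by `shift` positions. Source and destination overlap whenever
   shift < count, so memmove() is required; memcpy() would be undefined. */
static void shift_right(char *buf, size_t bufsize, size_t count, size_t shift)
{
    /* Precondition: the shifted block must still fit inside the buffer. */
    assert(shift <= bufsize && count <= bufsize - shift);
    memmove(buf + shift, buf, count);
}
```

With `char buf[16] = "abcdef";`, calling `shift_right(buf, sizeof buf, 6, 2)` leaves the first eight bytes as `ababcdef`, with no overlap check needed on the caller's side.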

The most popular implementation (GNU C Library, glibc):

rettype
inhibit_loop_to_libcall
MEMMOVE (a1const void *a1, a2const void *a2, size_t len)
{
  unsigned long int dstp = (long int) dest;
  unsigned long int srcp = (long int) src;

  /* This test makes the forward copying code be used whenever possible.
     Reduces the working set.  */
  if (dstp - srcp >= len)   /* *Unsigned* compare!  */
    {
      /* Copy from the beginning to the end.  */

#if MEMCPY_OK_FOR_FWD_MEMMOVE
      dest = memcpy (dest, src, len);
#else
      /* If there not too few bytes to copy, use word copy.  */
      if (len >= OP_T_THRES)
        {
          /* Copy just a few bytes to make DSTP aligned.  */
          len -= (-dstp) % OPSIZ;
          BYTE_COPY_FWD (dstp, srcp, (-dstp) % OPSIZ);

          /* Copy whole pages from SRCP to DSTP by virtual address
             manipulation, as much as possible.  */

          PAGE_COPY_FWD_MAYBE (dstp, srcp, len, len);

          /* Copy from SRCP to DSTP taking advantage of the known
             alignment of DSTP.  Number of bytes remaining is put
             in the third argument, i.e. in LEN.  This number may
             vary from machine to machine.  */

          WORD_COPY_FWD (dstp, srcp, len, len);

          /* Fall out and copy the tail.  */
        }

      /* There are just a few bytes to copy.  Use byte memory operations.  */
      BYTE_COPY_FWD (dstp, srcp, len);
#endif /* MEMCPY_OK_FOR_FWD_MEMMOVE */
    }
  else
    {
      /* Copy from the end to the beginning.  */
      srcp += len;
      dstp += len;

      /* If there not too few bytes to copy, use word copy.  */
      if (len >= OP_T_THRES)
        {
          /* Copy just a few bytes to make DSTP aligned.  */
          len -= dstp % OPSIZ;
          BYTE_COPY_BWD (dstp, srcp, dstp % OPSIZ);

          /* Copy from SRCP to DSTP taking advantage of the known
             alignment of DSTP.  Number of bytes remaining is put
             in the third argument, i.e. in LEN.  This number may
             vary from machine to machine.  */

          WORD_COPY_BWD (dstp, srcp, len, len);

          /* Fall out and copy the tail.  */
        }

      /* There are just a few bytes to copy.  Use byte memory operations.  */
      BYTE_COPY_BWD (dstp, srcp, len);
    }

  RETURN (dest);
}
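The single unsigned comparison `dstp - srcp >= len` in the excerpt above does all the overlap handling. A simplified, byte-at-a-time sketch of the same idea follows. It is not glibc's code: the name `sketch_memmove` is invented, and it assumes that converting a pointer to `uintptr_t` yields a flat address, which the C standard does not guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified sketch of the direction test; assumes flat addresses. */
static void *sketch_memmove(void *dest, const void *src, size_t len)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    uintptr_t dstp = (uintptr_t)dest;
    uintptr_t srcp = (uintptr_t)src;

    assert(len == 0 || (dest != NULL && src != NULL));

    if (dstp - srcp >= len) {
        /* Either dest is below src (the unsigned difference wraps to a
           huge value) or dest starts at least len bytes above src.
           Copying front to back never overwrites unread source bytes. */
        for (size_t i = 0; i < len; i++)
            d[i] = s[i];
    } else {
        /* dest lies inside [src, src + len): copy back to front. */
        for (size_t i = len; i > 0; i--)
            d[i - 1] = s[i - 1];
    }
    return dest;
}
```

Because the subtraction is unsigned, one test covers both "dest before src" and "dest far enough after src", which is why the library does not need two separate comparisons.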
0___________
  • I wish I could accept more than one answer. Thank you very much for this, it is a great answer too! On a side note, is the prototype right? Shouldn't `a1` and `a2` be `dest` and `src`, respectively? In any case, you also confirmed there is no portable way of detecting partial aliasing between buffers, and the code is a welcome bonus. Thanks for adding extra value to the thread. – Harry K. May 21 '21 at 14:04

For any two generic pointers, you can't really do pointer arithmetic on them. This is regulated by the additive operators C17 6.5.6/8:

If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

Similar text exists for the relational operators (6.5.8): any two pointers compared with them must point into the same array or aggregate object, otherwise the behavior is undefined.

You can in theory convert the pointers to integers in the form of uintptr_t and do arithmetic on that one. If you know for certain that buf1 points at the beginning of an array of buf1size items, then you could in theory calculate if buf2 points at the same array or not, by doing integer arithmetic on uintptr_t. But there isn't much to gain from that.
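A minimal sketch of that integer-based check might look like the following. It is non-portable: the function name `ranges_overlap` is invented for illustration, and the code assumes a flat, linear address space in which the pointer-to-`uintptr_t` conversion preserves address order, which the standard does not promise.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the byte ranges [p1, p1 + n1) and [p2, p2 + n2) share at
   least one byte, 0 otherwise. ASSUMPTION: uintptr_t values behave like
   flat linear addresses on the target platform. */
static int ranges_overlap(const void *p1, size_t n1,
                          const void *p2, size_t n2)
{
    uintptr_t a = (uintptr_t)p1;
    uintptr_t b = (uintptr_t)p2;

    if (n1 == 0 || n2 == 0)
        return 0;                 /* an empty range overlaps nothing */

    /* Use differences rather than a + n1 so the sums cannot wrap. */
    return (a <= b) ? (b - a < n1) : (a - b < n2);
}
```

Unlike the check in the question, this tests both directions, so it also catches the case where buf2 starts below buf1 but extends into it.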

Instead you could simply write your function as

void func (char* restrict buf1, char* restrict buf2);

And push the responsibility of ensuring that the two buffers don't alias onto the caller.

As for your function pointer selection of either memcpy or memmove, the mainstream compilers (gcc, clang) apparently ignore the fact that one version has restrict-qualified pointers. Whether that's conforming behavior or not, I'm not sure.
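For what it's worth, my reading of C99 6.7.5.3p15 is that, when function types are checked for compatibility, each parameter declared with a qualified type is taken as having the unqualified version of that type. Since `restrict` here qualifies the parameters themselves, that would make the two function types compatible and the cast unnecessary. A small sketch under that reading (the typedef `copy_fn` and the helper `select_copy` are hypothetical names):

```c
#include <stddef.h>
#include <string.h>

/* Pointer type written without restrict; under the reading above, both
   memcpy and memmove convert to it without a cast. */
typedef void *(*copy_fn)(void *, const void *, size_t);

/* Hypothetical helper: pick the copy routine from a caller-supplied flag. */
static copy_fn select_copy(int may_overlap)
{
    copy_fn fn = memcpy;      /* no cast */
    if (may_overlap)
        fn = memmove;         /* no cast here either */
    return fn;
}
```

Calling through the pointer does not relax memcpy's requirements: if the buffers overlap and `fn` still points at memcpy, the behavior is undefined regardless of how the pointer was declared.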

Lundin
  • Now I get what you meant by *"can reason the buffer sizes"* in your comment on my post, Lundin. The sizes are indeed validated, since they are members of the same `struct` containing the buffers, but anyway the big lesson for me here is that we can't detect buffer aliasing in a portable way. On a side note, I wouldn't really like to pass aliasing responsibility to the caller, so I'll just go with @MikeCAT's suggestion and use `memmove()` unconditionally in this particular case. I'm accepting the answer, thank you very much! – Harry K. May 21 '21 at 13:56
  • I doubt there are any guarantees (by the C standard) that two ranges of `uintptr_t` values, with each range derived from a pointer to different objects (not part of the same aggregate) and the lengths of the objects, can be meaningfully compared with each other for overlap to draw any conclusion about overlap of the original objects. – Ian Abbott May 21 '21 at 14:21
  • @IanAbbott No portable guarantees, no. The binary representation of pointers is implementation-defined and one has to know the specific system to make sense of it. However, the vast majority of real-world computers use linear addresses from 0 to max, without any special tricks. – Lundin May 24 '21 at 08:00