Is the following implementation of strnlen invalid?

#include <string.h>

size_t strnlen(const char *str, size_t maxlen)
{
    /* Look for the terminator within the first maxlen bytes. */
    const char *nul = memchr(str, '\0', maxlen);
    return nul ? (size_t)(nul - str) : maxlen;
}

I assume that memchr may always look at maxlen bytes no matter the contents of those bytes. Does the contract of strnlen only allow it to look at all maxlen bytes if there is no NUL terminator? If so, the size in memory of str may be less than maxlen bytes, in which case memchr might try to read invalid memory locations. Is this correct?

JudeMH
  • With care, yes! I don’t think `memchr()` will read beyond the first occurrence of the sought character. – Jonathan Leffler Jan 19 '20 at 03:07
  • *"...in which case memchr might try to read invalid memory locations. Is this correct?"* -- if that happens, you would have invoked UB with `strlen` as well as there is no `'\0'`. So it is somewhat a *circular-hypothetical*. – David C. Rankin Jan 19 '20 at 04:28

2 Answers

No, it is not invalid: the implementation posted is conforming, because memchr() must behave as if it does not read bytes from str beyond the first occurrence of '\0'.

C17 7.24.5.1 The memchr function

Synopsis

#include <string.h>
void *memchr(const void *s, int c, size_t n);

Description

The memchr function locates the first occurrence of c (converted to an unsigned char) in the initial n characters (each interpreted as unsigned char) of the object pointed to by s. The implementation shall behave as if it reads the characters sequentially and stops as soon as a matching character is found.

Returns

The memchr function returns a pointer to the located character, or a null pointer if the character does not occur in the object.

memchr may be implemented with efficient techniques that test multiple bytes at a time, potentially reading beyond the first matching byte, but only if this does not cause any visible side effects.
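
As an illustration of the "as if" wording quoted above, here is a minimal sketch of a byte-by-byte search. The name memchr_ref is hypothetical and is not taken from any library; real implementations are usually far more optimized, but their observable behavior should match this loop.

```c
#include <stddef.h>

/* Hypothetical reference version of memchr; the name is illustrative only. */
void *memchr_ref(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char target = (unsigned char)c;  /* c is converted to unsigned char */

    for (size_t i = 0; i < n; i++) {
        if (p[i] == target)
            return (void *)(p + i);  /* stop at the first match; later bytes are never read */
    }
    return NULL;  /* no match in the initial n bytes */
}
```

With this loop, a call such as memchr_ref("abc", '\0', 100) touches only the four bytes of the string literal, because the search ends at the terminator.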

chqrlie

I assume that memchr may always look at maxlen bytes no matter the contents of those bytes.

That assumption is wrong. From POSIX:

Implementations shall behave as if they read the memory byte by byte from the beginning of the bytes pointed to by s and stop at the first occurrence of c (if it is found in the initial n bytes).
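
If you would rather not depend on how a particular memchr behaves, a loop-based variant is one alternative. This is only a sketch, and the name strnlen_loop is hypothetical; it is chosen here to avoid clashing with the library's strnlen.

```c
#include <stddef.h>

/* Hypothetical loop-based variant; it never reads past the first NUL byte. */
size_t strnlen_loop(const char *str, size_t maxlen)
{
    size_t len = 0;

    /* Check the bound first so str[len] is only read while len < maxlen. */
    while (len < maxlen && str[len] != '\0')
        len++;
    return len;
}
```

For a 4-byte array holding "abc", strnlen_loop(array, 64) returns 3 and reads only those 4 bytes, which is the same result the memchr-based version is expected to give.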