The strncmp()
function really only has one use case (for lexicographical ordering):
One of the strings has a known length,† the other string is known to be NUL terminated. (As a bonus, the string with known length need not be NUL terminated at all.)
The reasons I believe there is just one use case (prefix match detection is not lexicographical ordering):‡ (1) If both strings are NUL terminated, strcmp()
should be used, as it will do the job correctly; and (2) If both strings have known length, memcmp()
should be used, as it will avoid the unnecessary check against NUL on a byte per byte basis.
I am seeking an idiomatic (and readable) way to use the function to lexicographically compare two such arguments correctly (one of them is NUL terminated, one of them is not necessarily NUL terminated, with known length).
Does an idiom exist? If so, what is it? If not, what should it be, or what should be used instead?
Simply using the result of strncmp()
won't work, because it will result in a false equality result in the case that the argument with known length is shorter than the NUL terminated one, and it happens to be a prefix. Therefore, extra code is required to test for that case.
As a standalone function I don't see much wrong with this construction, and it appears idiomatic:
/* s1 is NUL terminated */
int variation_as_function (const char *s1, const char *s2, size_t s2len) {
int result = strncmp(s1, s2, s2len);
if (result == 0) {
result = (s1[s2len] != '\0');
}
return result;
}
However, when inlining this construction into code, it results in a double test for 0
when equality needs special action:
int result = strncmp(key, input, inputlen);
if (result == 0) {
result = (key[inputlen] != '\0');
}
if (result == 0) {
do_something();
} else {
do_something_else();
}
The motivation for inlining the call is because the standalone function is esoteric: It matters which string argument is NUL terminated and which one is not.
Please note, the question is not about performance, but about writing idiomatic code and adopting best practices for coding style. I see there is some DRY violation with the comparison. Is there a straightforward way to avoid the duplication?
† By known length, I mean the length is correct (there is no embedded NUL that would truncate the length). In other words, the input was validated at some earlier point in the program, and its length was recorded, but the input is not explicitly NUL terminated. As a hypothetical example, a scanner on a stream of text could have this property.
‡ As has been pointed out by addy2012,strncmp()
could be used for prefix matching. I as focused on lexicographical ordering. However, (1) If the length of the prefix string is used as the length argument, both arguments need to be NUL terminated to guard against reading past an input string shorter than the prefix string. (2) If the minimum length is known between the prefix string and the input string, thenmemcmp()
would be a better choice in terms of providing equivalent functionality at less CPU cost and no loss in readability.