Given, for example, a char *p that points to the first character in "there is so \0ma\0ny \0 \\0 in t\0his stri\0ng !\0\0\0\0", how would Strrchr() find the last occurrence of the null character?

The following questions arise:
=> What condition would it depend on to stop the loop?
=> I think that in all cases it will try to access the next memory area to check its condition, at some point going past the string boundaries, which is UB. So is it safe?

Please feel free to correct me if I'm wrong!

interesting
  • 2
    The last one can only be the first - otherwise you have no way of knowing where the string "ends". You can keep finding zeros until you run out of memory. – Weather Vane Nov 11 '21 at 14:46
  • What do you mean by the last null character ***in the string***? If you don't know the length, when do you stop reading memory? – Tim Randall Nov 11 '21 at 14:47
  • 1
    By definition, a string is the sequence of characters preceding and including the terminating null character. If you want to find the last null character in a character array, you must know the size of the array. – William Pursell Nov 11 '21 at 14:49
  • 2
    If all you have is `char *p` pointing to the first character, then not only is it *not* an "easy problem", it is perfectly impossible. There is no information whatsoever to tell you how long the original string was. One `\0` is as good as another to terminate a string, so once you find one, you have no way of knowing it's not the real one. For all intents and purposes, the string ends at the first `\0`. – Steve Summit Nov 11 '21 at 14:54
  • The problem arose while trying to re-implement strrchr: since it finds the last ```c``` in a string, if ```c == '\0'``` then it should find its position, and I wondered how it finds it. – interesting Nov 11 '21 at 14:55
  • To wrap it all up: **You cannot.** If you don't know the number of elements of an array, you can not know that you reached the end. This is independent of the element type of said array. -- And yes, a "string" in C means an array of `char`s with exactly one `'\0'` as its last element. – the busybee Nov 11 '21 at 14:56
  • As mentioned, the occurrence is the *only* occurrence. The nul character isn't part of the string data so it makes no sense searching for it. – Weather Vane Nov 11 '21 at 14:56
  • @home [Edit] the question and show your attempt to re-implement `strrchr`. Then we probably can tell you more than just "a string ends at the first null character" (which is correct btw). – Jabberwocky Nov 11 '21 at 14:58
  • 2
    For implementing `strrchr` do this: find the _first_ null character, that's the end of the string. Then go back from there until you've found the character or you're at the start of the string. It's 4-5 lines of C code. – Jabberwocky Nov 11 '21 at 15:02
  • 1
    A more efficient version would remember the most recently seen position of the search character and only need to traverse the string once. Still only a few lines. – Shawn Nov 11 '21 at 15:06
  • What is `Strrchr`? Did you mean `strchr`? C is case sensitive and typo sensitive. – Lundin Nov 11 '21 at 15:08
  • ```char * strrchr(const char *s, int c)``` The strrchr() function is identical to strchr(), except it locates the last occurrence of c.( from the man). – interesting Nov 11 '21 at 15:11

3 Answers

It's very simple, as explained in the comments. The first \0 is the last and the only one in a C string.

So if you write

char *str = "there is so \0ma\0ny \0 \\0 in t\0his stri\0ng !\0\0\0\0";
char *p = strrchr(str, 's');
printf("%s\n", p);

it will print

so 

because strrchr will find the 's' in "so", which is the last 's' in the string you gave it. And (to answer your specific question) if you write

p = strrchr(str, '\0');
printf("%d %s\n", (int)(p - str), p+1);

it will print

12 ma

proving that strrchr found the first \0.

It's obvious to you that str is a long string with some embedded \0's in it. But, in C, there is no such thing as a "string with embedded \0's in it". It is impossible, by definition, for a C string to contain an embedded \0. The first \0, by definition, ends the string.
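For the re-implementation mentioned in the comments, a minimal single-pass sketch might look like the following. The name my_strrchr is made up for illustration, and this is not how any particular C library implements it; it only shows that the loop's stop condition is the first \0, and that searching for '\0' simply returns that position.

```c
#include <stddef.h>

/* Hypothetical re-implementation for illustration only;
   not taken from any real C library. */
char *my_strrchr(const char *s, int c)
{
    const char *last = NULL;   /* most recent match seen so far */
    const char ch = (char)c;   /* the search value is compared as a char */

    for (;; s++) {
        if (*s == ch)
            last = s;          /* also matches the terminator when ch is '\0' */
        if (*s == '\0')
            break;             /* the first null ends the string: stop reading */
    }
    return (char *)last;       /* NULL if ch never occurred */
}
```

Because the loop never reads past the first \0, it stays inside the string and there is no out-of-bounds access. Called as my_strrchr(str, '\0') on the str above, it would return str + 12.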


One more point. You had mentioned that if you were to "access the next memory area", you would be "at some point bypassing the string boundaries, UB!" And you're right. In my answer, I flirted with danger when I said

p = strrchr(str, '\0');
printf("%d %s\n", (int)(p - str), p+1);

Here, p points to what strrchr thinks is the end of the string, so when I compute p+1 and try to print it using %s, if we don't know better it looks like I've indeed strayed into Undefined Behavior. In this case it's safe, of course, because we know exactly what's beyond the first \0. But if I were to write

char *str2 = "hello";
p = strrchr(str2, '\0');
printf("%s\n", p+1);         /* WRONG */

then I'd definitely be well over the edge.

Steve Summit

There is a difference between "a string", "an array of characters" and "a char* pointer".

  • A C String is a number of characters terminated by a null character.
  • An array of characters is a defined number of characters.
  • A char* pointer is technically a pointer to a single character, but often used to mark a point in a C style string.

You say you have a pointer to a character (char*p) and the value of *p is 't', but you believe that *p is the first character of a C style string "there is so \0ma\0ny \0 \\0 in t\0his stri\0ng !\0\0\0\0".

As others have said, because you said this is a C style string and you don't know its length, the first null after p marks the end of the string.

If this were a character array char str[40], then you could find the last null by looping from the end of the array towards the start, for (i=39; i>=0; i--), BUT you don't know the length, so that won't work.
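As a rough sketch of that backwards loop, assuming the array size really is known: the helper name last_nul_index is hypothetical, and the caller has to pass the true size of the array (for example sizeof str), which is exactly the information a bare char* does not carry.

```c
#include <stddef.h>

/* Hypothetical helper: index of the last '\0' in buf[0..n-1],
   or -1 if the array contains no null character at all. */
ptrdiff_t last_nul_index(const char *buf, size_t n)
{
    /* Walk from the end towards the start; i is one past the index
       being tested so the unsigned counter never wraps below zero. */
    for (size_t i = n; i > 0; i--) {
        if (buf[i - 1] == '\0')
            return (ptrdiff_t)(i - 1);
    }
    return -1;
}
```

For char str[40] initialised from a shorter literal, the rest of the array is zero-filled, so last_nul_index(str, sizeof str) would be 39.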

Hope that helps, and please excuse me if I have strayed into C++; it's 25 years since I did C :)

Code Gorilla
  • 1
    Excellent point drawing the distinction between those three related concepts. (And the rest of your answer is fine, too; I don't see any C++ bias.) – Steve Summit Nov 11 '21 at 15:42

In the case you present, you can never know whether the null character you've found is the last one, since nothing tells you where the string's storage ends. As it is a C string, it is guaranteed to end with a '\0', but if you decide to go beyond that, you can't know whether the memory you're accessing is yours. Accessing memory outside an array has undefined behaviour: you might be accessing the next object in memory that is yours, you might touch memory that is unallocated but whose block still belongs to your process, or you might touch a segment that is not yours at all. Typically only the third will cause a SIGSEGV. You can see this question to check for segmentation fault without crashing your program, but your string could have ended way before you can catch it that way.

There is a reason for strings to have an ending character. If you insist on having \0 in multiple places in your data, you can terminate it with another character, but note that all library string functions will still consider the first \0 to be the end of the string.
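To illustrate, here is a small sketch that contrasts what strlen sees with what is actually stored when the size is tracked separately. The helper count_nul_bytes is a made-up name, and it assumes the caller passes the real number of bytes in the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts every '\0' in buf[0..n-1].
   It relies on the caller supplying the true size n. */
size_t count_nul_bytes(const char *buf, size_t n)
{
    size_t count = 0;
    const char *p = buf;
    const char *end = buf + n;

    /* memchr takes an explicit length, so unlike the str* functions
       it does not stop at the first null. */
    while (p < end && (p = memchr(p, '\0', (size_t)(end - p))) != NULL) {
        count++;
        p++;   /* continue the search just past the null we found */
    }
    return count;
}
```

For const char buf[] = "a\0b\0c", strlen(buf) reports 1, while count_nul_bytes(buf, sizeof buf) finds 3 nulls, the last being the terminator the compiler appends to the literal.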

Having multiple \0 characters in your strings is considered bad practice, so avoid it if you can.