How does this condition check work?

Question

I wrote a function that reverses a string using pointers. The code runs fine, with no apparent bugs, but there are some things I want to understand.

Here's my code:

char * xstrrev(char *s1, char *s2){
 register char *p = s1;
 register char *q = s2;
 char *r  = s2;
  do{

     *(s2++) = *(p++); //first copy the string, afterwards, replace it

  }while(*p);
  p--; //to eliminate trailing '\0' while reversing.
  do {
    *(q++) = *(p--); //replace contents by reverse contents,
  }while(*q);
  return r;
}

Here, in the third-to-last line, `*q` must eventually be `'\0'`, because we copied the exact string previously, so the `'\0'` must have been copied too.

However, when I replace

*(s2++) = *(p++);

with

p++;

i.e., I only advance `p` to the end of the string and do not copy the string to `s2`, the condition

while(*q)

still works. In this case, `*q` is not supposed to be `\0`, right? How does this condition work then?

It's the same when I replace `while(*q)` with `while(*q!='\0')`.

EDIT: It's called like this:

char  a[110]= "hello";
char f[116];
xstrrev(a,f); //reverse a and put to f
puts(f);

This is so undefined behavior! Change `char f[116];` to `char f[116] = "helloworld";` to see how it breaks. — Sergey Kalinichenko, Apr 11 '13 at 02:59
Why are you setting `a` to a palindrome? How can you tell if it was actually reversed? :) — Barmar, Apr 11 '13 at 03:04
@Barmar It's the same for any other string, wait, i'll edit that. — cipher, Apr 11 '13 at 03:05
My comment has nothing to do with the question, it was just incidental. — Barmar, Apr 11 '13 at 03:06

Barmar · Accepted Answer · 2013-04-11T02:56:47.570

3

If it's working, that's purely accidental. The buffer `s2` supplied by the caller may happen to contain a `\0`, but you can't depend on this: it depends on how the caller initialized the buffer it passes.

Another possibility is that the memory right BEFORE `a` happens to contain `\0`. This can happen if you have something like:

char something[] = "foo";
char a[110] = "nacan";

The memory for `something` is right before the memory for `a`, so `something`'s trailing null byte sits just before the first byte of `a`.

What happens in this case is that the loop copies this `\0`, but it doesn't stop immediately. It keeps on copying until it eventually runs into a `\0` in `*q`. But when you look at `f`, you just see the reverse of `a`, because this null byte was copied.
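The scenario described here can be reproduced without undefined behavior by putting both strings in one array, so that "the byte before `a`" is a defined location. This is only a sketch: `reverse_loop`, `demo_layout`, the single `mem` array, and the `X` filler standing in for uninitialized garbage are assumptions made for illustration, not the asker's actual stack layout.

```c
#include <assert.h>
#include <string.h>

/* The second loop from the question, unchanged apart from counting writes:
 * copy backwards from p into q until the NEXT byte of q is already '\0'. */
size_t reverse_loop(const char *p, char *q) {
    char *start = q;
    do {
        *(q++) = *(p--);
    } while (*q);
    return (size_t)(q - start);
}

/* Hypothetical layout: "foo" sits directly before "hello" in one array,
 * so reading below a[0] stays inside mem and is well defined here.
 * out must point to at least 9 bytes. */
size_t demo_layout(char *out) {
    char mem[] = "foo\0hello";   /* f o o \0 h e l l o \0 */
    const char *a = mem + 4;     /* a[-1] is the '\0' that ends "foo" */
    assert(out != NULL);
    memcpy(out, "XXXXXXXX", 9);  /* 8 filler bytes plus '\0', standing in for garbage */
    return reverse_loop(a + strlen(a) - 1, out);
}
```

Calling `demo_layout(f)` on a `char f[9]` leaves `f` holding `olleh`, then a `\0`, then `oo`. The loop wrote 8 bytes, but `puts(f)` stops at the copied null byte and shows only `olleh`.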

If you want to see this happening, single-step your function in the debugger.

None of this is guaranteed by the C language; it's just how memory is often laid out.
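For comparison, here is a minimal sketch of a reversal whose termination depends only on the source string, not on whatever the destination or neighbouring memory happens to contain. The name `xstrrev_fixed` is hypothetical, and the sketch assumes the caller has made `dst` large enough for `strlen(src) + 1` bytes.

```c
#include <assert.h>
#include <string.h>

/* Reverse src into dst and write the terminator explicitly.
 * Assumes dst has room for strlen(src) + 1 bytes and does not overlap src. */
char *xstrrev_fixed(const char *src, char *dst) {
    assert(src != NULL && dst != NULL);
    size_t len = strlen(src);
    for (size_t i = 0; i < len; i++) {
        dst[i] = src[len - 1 - i];
    }
    dst[len] = '\0';   /* do not rely on a '\0' already being there */
    return dst;
}
```

With this version, initializing the destination as `char f[116] = "helloworld";` before the call still yields `olleh`, because the loop bound comes from `strlen(src)` and the terminator is written by the function itself.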

edited Apr 11 '13 at 02:56

answered Apr 11 '13 at 02:40

Barmar


Nope. The caller does not supply that. See my Edit . – cipher Apr 11 '13 at 02:42
What do you see when you look at `s2` in a debugger at the beginning of the function? – Barmar Apr 11 '13 at 02:44
What's the value of `f` before you call the function? – Barmar Apr 11 '13 at 02:48
I wrote, `puts(f)` after initializing `f` resulted in: GMING-EH-TDM1-SJLJ-GTHR-MINGW32 – cipher Apr 11 '13 at 02:50
I think you are fortunate that whatever is on the stack immediately before s2's address has the high byte 0. – Fred Apr 11 '13 at 02:52
@Fred : I think that would not be the case, on *each and every* run !! – cipher Apr 11 '13 at 02:53
Now why does `f` contain these characters if nothing is assigned to it? – cipher Apr 11 '13 at 02:55
@cipher: because it's in the C standard – 0A0D Apr 11 '13 at 02:55
@cipher, why? The program clearly isn't using the terminating NULL from p, and it copies bytes until it copies over a '\0'. You have an extra 110 characters before overwriting the end of f. Surely you'll find a zero byte in that range in front of p, but it is just luck, because you don't get it from p. – Fred Apr 11 '13 at 03:00
@Barmar But `p` is decremented first (`p--`) to skip the '\0'. And while copying the contents of `p` to `s2` in reverse order, when `p` reaches the first character, it's not certain that the memory before it is '\0', so it should have shown some random characters afterwards, shouldn't it? – cipher Apr 11 '13 at 03:01
It's not using the terminating null from `p`, it's using the terminating null from the string BEFORE p in memory. – Barmar Apr 11 '13 at 03:02
@Fred , but we cannot be sure that something before p, is a null character, which must have resulted in q containing some other characters before ending – cipher Apr 11 '13 at 03:02
@cipher: you are right... that's why you have to check or use a secure string library – 0A0D Apr 11 '13 at 03:04
@0A0D , Sorry, didn't get what you meant by `have to check` can you elaborate ? – cipher Apr 11 '13 at 03:07
Functions that write into user-supplied strings should be given the length of the output string, so they won't write past the end of it. That's called a "buffer overflow", and is the cause of many security exploits. – Barmar Apr 11 '13 at 03:09
@Barmar : Thanks for the info, but how does that answer my question? :) – cipher Apr 11 '13 at 03:10
I think that's what he meant by "have to check". The caller tells you the length, and you make sure you don't loop any more than that. – Barmar Apr 11 '13 at 03:15
@cipher: What @Barmar is saying is that arrays decay into pointers when passed into functions in C. You can't check the size of the array by simply doing a `sizeof` either, because that will only return the size of the pointer. So if you know the size of the array, you should pass it in so you don't go beyond the memory allocated to the array. Otherwise, you will go out of bounds and start reading garbage (which might include a `0` or `\0`; it all depends on what is on the stack at the time). – 0A0D Apr 11 '13 at 11:57
@0A0D Yeah. So I am asking why no garbage value is shown in the output. Why does it always show the reverse of the given string without any extra garbage? – cipher Apr 11 '13 at 13:06
Probably because the layout of the stack frame is similar in all the cases you tried, so there's always a zero byte just before `a`. – Barmar Apr 11 '13 at 13:17
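
The suggestion made in the comments above (pass the output size so the function cannot write past it) could look like the following sketch. `xstrrev_n` and its return-`NULL`-when-too-small convention are hypothetical choices, not something proposed in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reverse src into dst, never writing more than dst_size bytes.
 * Returns dst on success, or NULL if the reversed string plus its
 * terminator would not fit. */
char *xstrrev_n(const char *src, char *dst, size_t dst_size) {
    assert(src != NULL && dst != NULL);
    size_t len = strlen(src);
    if (len >= dst_size) {
        return NULL;           /* no room for len characters and the '\0' */
    }
    for (size_t i = 0; i < len; i++) {
        dst[i] = src[len - 1 - i];
    }
    dst[len] = '\0';
    return dst;
}
```

A caller would pass the size while the array is still in scope, for example `xstrrev_n(a, f, sizeof f)`. Inside the function, `sizeof dst` would only give the size of a pointer, which is why the size has to travel as a separate argument.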
