3

I have had real trouble understanding char* lately. Say I wrote a recursive function to reverse a char*; depending on how I initialize the string I get access violations, and my C++ primer didn't point me toward an explanation, so I am seeking your help.

CASE 1 The first case, where I got an access violation when trying to swap letters around:

char * bob = "hello";

CASE 2 Then I tried this to get it to work:

char * bob = new char[5];
bob[0] = 'h';
bob[1] = 'e';
bob[2] = 'l';
bob[3] = 'l';
bob[4] = 'o';

CASE 3 But then when I did a cout I got some random crap at the end, so I changed it to:

char * bob = new char[6];
bob[0] = 'h';
bob[1] = 'e';
bob[2] = 'l';
bob[3] = 'l';
bob[4] = 'o';
bob[5] = '\0';

CASE 4 That worked, so I asked myself why this wouldn't work too:

 char * bob = new char[6];
 bob = "hello\0";

CASE 5 And it failed. I have also read somewhere that you could do something like:

char* bob[];

Then add something to that. My question is: why do some of these fail and others not, and what is the best way to do it?

sth
DogDog
  • Well, I am doing this as an exercise to understand pointers and such. – DogDog Feb 11 '10 at 03:51
  • String literals ("Hello") actually have type array of 'char const', which decays to 'char const*'. So technically it is a bad idea to assign them to variables of type 'char*'. The reason that even works is a specific exception in the standard that allows an implicit conversion from a string literal to 'char*', just for backward compatibility with C. – Martin York Feb 11 '10 at 13:03
  • Well, when you've got to work with code from someone else, you've got to work with what they gave you. – DogDog Feb 16 '10 at 15:41

4 Answers

11

The key is that some of these pointers point at allocated memory (which is read/write) and some of them point at string constants. String constants are stored in a different location than the allocated memory and can't safely be changed: attempting it is undefined behavior, and on most systems it triggers an access violation. Often vulnerabilities in systems are the result of code or constants being changed, but that is another story.

In any case, the key is the use of the new keyword: it allocates space in read/write memory, and thus you can change that memory.

This statement is wrong:

char * bob = new char[6];
bob = "hello\0";

because you are changing the pointer, not copying the data (and the memory from new is leaked). What you want is this:

char * bob = new char[6];
strcpy(bob,"hello");

or

strncpy(bob,"hello",6);

You don't need to add the nul here, because the compiler already places a nul at the end of the string constant "hello".
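
A minimal sketch of the copy approach above, sizing the buffer from the source string instead of hard-coding 6. The helper name `copy_literal` is hypothetical and only here for illustration; it is not part of any library.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns a writable, heap-allocated copy of `src`.
// The caller owns the result and must release it with delete[].
char* copy_literal(const char* src) {
    std::size_t len = std::strlen(src);  // length without the terminator
    char* dst = new char[len + 1];       // +1 leaves room for the '\0'
    std::strcpy(dst, src);               // copies the characters and the '\0'
    return dst;
}
```

Used as `char* bob = copy_literal("hello");`, the result can be modified freely (for example `bob[0] = 'j';`) and should be released with `delete[] bob;` when no longer needed.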

Hogan
  • @Hogan: To be nitpicky, you are using null as in the pointer context...for a nul with one l that is '\0', for a null with two l that is NULL...that would be confusing to beginners... – t0mm13b Feb 11 '10 at 03:02
  • @tommieb75: no prob. I changed it. – Hogan Feb 11 '10 at 03:24
  • @tommieb75: actually "nul" might be wrong, that is the term for the ASCII 0, but I think that `bob[5] = NULL;` and `bob[5] = 0;` would both work but `bob[5] = NUL;` would fail. I'm too lazy to test it or check the standard, however. – Hogan Feb 11 '10 at 03:31
  • Is char * bob = new char[]; strcpy(bob,"hello"); supposed to work? Because it does. – DogDog Feb 11 '10 at 03:33
  • It will compile and run, Apoc, but you are writing over random memory. This way lies buggy code. Allocate the space. – Hogan Feb 11 '10 at 03:49
  • Be aware of the demons in strncpy. if you'd done strncpy(bob,"Hello!",6); 'bob' would _not_ be nul terminated. – nos Feb 11 '10 at 11:01
  • @nos: true dat, but at the same time strcpy(bob"Hello!") would write over the end of the 6 character buffer he has. – Hogan Feb 11 '10 at 11:37
  • @nos: I always did this: `strncpy(s1,s2,x); s1[x] = 0;`, two lines, happy together. – Hogan Feb 11 '10 at 11:38
  • @Hogan: no, `bob` is 6 chars and "Hello!" is 6 chars. strncpy will fill the buffer but not overflow it. It just won't write a nul terminator at either `bob[5]` or `bob[6]`. If you ever do `s1[x] = 0`, make sure s1 is at least `x+1` big. – nos Feb 11 '10 at 18:50
  • @nos: ok, I said above `strcpy(bob,"Hello!")` would write over the end of the 6 character buffer and it would. `strcpy()` copies n+1 where n is the size of the 2nd argument as we both know. So what are you saying I got wrong (besides leaving out a comma)? – Hogan Feb 11 '10 at 21:09
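
The strncpy pitfall discussed in these comments can be sketched as follows. `bounded_copy` is a hypothetical helper name; the idea is simply to copy at most one character fewer than the buffer holds and then write the terminator inside the buffer.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies at most dst_size - 1 characters from src
// into dst and always writes a terminating '\0'. dst_size must be > 0.
void bounded_copy(char* dst, std::size_t dst_size, const char* src) {
    std::strncpy(dst, src, dst_size - 1);  // writes no '\0' if src is too long
    dst[dst_size - 1] = '\0';              // so terminate explicitly, in bounds
}
```

With a 6-character buffer, copying "Hello!" this way truncates the text to "Hello" rather than leaving the buffer unterminated.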
1
char * bob = "hello"; 

This is effectively translated to:

const char __hello[] = "hello";
char * bob = (char*) __hello;

You can't change it, because if you'd written:

char * bob = "hello"; 
char * sam = "hello"; 

It could be translated to:

const char __hello[] = "hello";
char * bob = (char*) __hello;
char * sam = (char*) __hello;

Now, when you write:

char * bob = new char[6];    
bob = "hello\0";

First you assign one value to bob, then you assign a new value to it, losing the pointer to the allocated memory. What you really want to do here is:

char * bob = new char[6];    
strcpy(bob, "hello");
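
Once the characters live in writable memory, the recursive reversal mentioned in the question can be sketched like this. The asker's own function is not shown in the thread, so `reverse_range` and `reverse_in_place` are hypothetical names and this is only one possible way to write it.

```cpp
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical helper: recursively reverses the characters in [first, last].
void reverse_range(char* first, char* last) {
    if (first >= last) return;           // zero or one character left
    std::swap(*first, *last);            // swap the outermost pair
    reverse_range(first + 1, last - 1);  // then recurse on the inside
}

// Reverses a nul-terminated string that lives in writable memory.
void reverse_in_place(char* s) {
    std::size_t len = std::strlen(s);
    if (len > 1) reverse_range(s, s + len - 1);
}
```

This works on the `new char[6]` buffer above, and also on `char bob[] = "hello";`, where the array form copies the literal into writable storage. Passing a pointer to the literal itself would still be undefined behavior.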
MSalters
James Curran
  • The key here is the `const` keyword. Constant merging optimizations are not really at issue IMHO. – Hogan Feb 11 '10 at 02:50
  • I'm not a C++ expert so this intrigues me. Does the standard mandate that duplicate string literals in a translation unit must share the same storage? – dreamlax Feb 11 '10 at 02:56
  • Just checked, 2.13.4.2 says "Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation-defined. The effect of attempting to modify a string literal is undefined.". – dreamlax Feb 11 '10 at 03:17
  • @dreamlax: exactly, that is why I called them optimizations. – Hogan Feb 11 '10 at 03:25
  • Changed "would be translated" to "could be translated" for that reason. – MSalters Feb 11 '10 at 10:42
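
A small sketch of what the quoted passage means in practice: comparing two pointers to identical literals compares addresses, and whether those addresses match is up to the implementation. The helper names `same_address` and `same_text` are hypothetical.

```cpp
#include <cstring>

// Compares addresses: true only if both pointers refer to the same storage.
bool same_address(const char* a, const char* b) {
    return a == b;
}

// Compares contents: true if both strings hold the same characters.
bool same_text(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}
```

For `const char* bob = "hello";` and `const char* sam = "hello";`, `same_text(bob, sam)` is always true, while `same_address(bob, sam)` may be true or false depending on the compiler and its settings, so code should not rely on either outcome.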
1

Edit: The question was originally tagged C and has since been retagged as C++.

Ok. You have got a couple of things mixed up... new is used by C++, not C.

  • Case #1. That declares a pointer to char and points it at a string literal. The literal itself must not be modified, which is why swapping its characters gave you an access violation... can you show the code you used to swap the characters?
  • Case #2/#3. You got random crap because the string was missing its nul terminator, i.e. '\0'... which ends every single C string you'll encounter for the duration of C/C++, possibly for the rest of your life...
+-+-+-+-+-+--+
|H|e|l|l|o|\0|
+-+-+-+-+-+--+
            ^
            |
         Nul Terminator
  • Case #4 did not work because you need strcpy to do that job; you cannot simply assign a string like that after calling new. When you declare a string as char *s = "foo"; the pointer is initialized to point at the literal, which is set up at compile time. But when you do it this way, char *s = new char[6]; strcpy(s, "hello"); the characters get copied into the memory that s points to.

You will eventually discover that the memory block s points to can easily get overwritten, which will induce a fit of conniptions as you realize that you have to be careful to prevent buffer overflows... Remember Case #3 and the nul terminator... don't forget that, really: the string needs 6 chars of storage, not 5, because we have to account for the nul terminator.

  • Case #5. That declares an array of pointers to char (as written it still needs a size or an initializer), i.e. an array of strings; think of it like this:
*(bob + 0) = "foo";
*(bob + 1) = "bar";

I know there is a lot to digest...but feel free to post any further thoughts... :) And best of luck in learning...

t0mm13b
1

You should always use char const* for pointers to string literals (stuff in double quotes). Even though the standard allows char* as well (a deprecated conversion that C++11 later removed), it does not allow writing to the string literal. GCC gives a compile warning for assigning a literal address into char*, but apparently some other compilers don't.

Tronic