4

I'm testing an example about strings in C++ from the "C++ Premiere" book.

const int size = 9;
char name1[size];
char name2[size] = "C++owboy";   // 8 characters here

cout << "Howdy! I'm " << name2 << "! What's your name?" << endl;

cin >> name1;  // I input "Qwertyuiop": 10 chars, which is more than the name1 array can hold

// now I do cout
cout << "Well, your name has " << strlen(name1) << " letters";  // "Well, your name has 10 letters"
cout << " and is stored in an array of " << sizeof(name1) << " bytes"; // "...stored in an array of 9 bytes"

How can 10 chars (plus '\0') be stored in an array sized for just 8 chars + the '\0' char? Does the array become wider at compilation? Or is the string stored somewhere else?

Also, I can't do:

const int size = 9;
char name2[size] = "C++owboy_12345";   // assign 14 characters to 9 chars array

But can do what I've written above:

cin >> name1;   // any length string into an array of smaller size

What is the trick here? I use NetBeans and Cygwin g++ compiler.

Green
  • 8
    Don't use `char` arrays, use `std::string`. –  Jul 12 '12 at 15:32
  • 3
    The trick is called **buffer overflow** and is considered a vulnerability with security implications in many scenarios. Not every buffer overflow leads to a crash or leads to a crash immediately. – 0xC0000022L Jul 12 '12 at 15:33
  • 1
    The behavior is undefined, as others have said. There's a pretty good chance, though, that the extra characters are going into `name2`. You might try printing that. – Fred Larson Jul 12 '12 at 15:37
  • 2
    This is why programmers should learn the rudiments of assembly language: stepping through this code in a debugger would not only answer his question, but also fill in other (apparent) gaps in his understanding. – egrunin Jul 12 '12 at 16:00
  • 1
    This is why C++ programmers should always prefer using objects like string and vector rather than fixed-size arrays. Sometimes you do need an array for interacting with system calls or other libraries but those are the exceptions. – Zan Lynx Feb 06 '14 at 20:30

4 Answers

9

Writing more entries into an array than the size of the array allows invokes undefined behavior. The computer might store that data anywhere, or not store it at all.

Typically, the data is stored in whatever happens to come next in memory. That might be another variable, an instruction stream, or even a control register for the bomb underneath your chair.

To put it simply: you have coded a buffer-overflow bug. Don't do that.


Just for fun: Undefined behavior is behavior that the C++ standard does not comment on. It can be anything, since the standard places no constraints on it.

In one particular case, the behavior increases my bank balance from $10 to $1.8 billion: http://ideone.com/35FQW

Can you see why that program might behave that way?

Robᵩ
5

name1 is given an address in memory. If you write 80 bytes to it, the write covers 80 bytes of memory starting at that address. If another variable is stored at name1's address + 20, its data is overwritten by that 80-byte write. That's just the way things work in C/C++. These are called buffer overflows, and they can be used to hack programs.
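A real overflow is undefined behavior, so it cannot be demonstrated safely. The sketch below only simulates the effect described above: one 18-byte block stands in for two adjacent 9-byte arrays, so every write stays inside a single object and is well defined. The function name `neighbour_after_overflow` and the adjacent layout are assumptions made for illustration; a real compiler is free to place `name1` and `name2` anywhere.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical simulation: the first 9 bytes of `memory` play the role of
// name1 and the next 9 bytes play the role of name2.
std::string neighbour_after_overflow(const char* input) {
    const std::size_t slot = 9;
    char memory[2 * slot] = {};

    // Put "C++owboy" plus its '\0' into the second slot.
    std::memcpy(memory + slot, "C++owboy", 9);

    // Copy the input plus its '\0' to the start of the block without
    // checking the slot size, but never past the block itself.
    std::size_t n = std::strlen(input) + 1;
    if (n > sizeof(memory)) n = sizeof(memory);
    std::memcpy(memory, input, n);
    memory[sizeof(memory) - 1] = '\0';

    // Return whatever the second slot holds now.
    return std::string(memory + slot);
}
```

With the input "Qwertyuiop", the tenth character and the terminating '\0' land in the second slot, so the simulated name2 reads as "p". A short input such as "Bob" leaves it as "C++owboy".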

Rocky Pulley
4

This is a typical buffer overflow. This is why you're always supposed to check the size of input if you're putting it in a buffer. Here is what's happening:

In C++ (and C), array names are just pointers to the first element of the array. The compiler knows the size of the array and will do some compile-time checks. But, during runtime, it'll just treat it as a char*.

When you did cin >> name1, you passed a char* to cin. cin doesn't know how big the allocated space is -- all it has is a pointer to some memory. So, it assumes you allocated enough space, writes everything, and goes past the end of the array. Here's a picture:

Bytes   1  2  3  4  5  6  7  8  9  10 11 12 13 14 15
Before  |-----name1 array-------|  |--- other data-|
After   Q  w  e  r  t  y  u  i  o  p  \0 |-er data-|

As you can see, you have overwritten other data that was stored after the array. Sometimes this other data is just junk, but other times it's important and could mean a tricky bug. Not to mention, this is a security vulnerability, because an attacker can overwrite program memory with user input.
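If the char array has to stay, one way to keep the read inside it is to set the stream width before extracting. This is a minimal sketch, not the book's code: the helper name `read_word_bounded` is hypothetical, and it relies on the standard rule that a nonzero width limits `>>` into a character array to width - 1 characters plus the terminating '\0'.

```cpp
#include <cstddef>
#include <iomanip>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one whitespace-delimited word into buf,
// storing at most N - 1 characters followed by '\0'.
template <std::size_t N>
void read_word_bounded(std::istream& in, char (&buf)[N]) {
    // setw tells operator>> how much room the destination has;
    // the width resets to 0 after this one extraction.
    in >> std::setw(static_cast<int>(N)) >> buf;
}
```

Called with a 9-element array and the input "Qwertyuiop", this stores "Qwertyui" and leaves "op" in the stream for the next read. As far as I know, C++20 replaced the `char*` overload of `>>` with one that takes the array by reference and applies this limit on its own, but the explicit width works with older standards too.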

The confusion about sizes arises because strlen counts bytes until it finds a '\0' (null terminator), so it reports 10 characters. sizeof(name1), on the other hand, reports the actual size of the array, which the compiler knows.
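The difference between the two measurements can be sketched with a small helper. The struct `Sizes` and the function `measure` are hypothetical names; the array is taken by reference so that `sizeof` still sees the whole array rather than a pointer.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical pair of results: string length versus array capacity.
struct Sizes {
    std::size_t length;    // characters before the first '\0' (run time)
    std::size_t capacity;  // bytes in the array (compile time)
};

template <std::size_t N>
Sizes measure(const char (&buf)[N]) {
    // strlen scans memory for '\0'; sizeof only looks at the type.
    return Sizes{std::strlen(buf), sizeof(buf)};
}
```

For `char name2[9] = "C++owboy";` this gives a length of 8 and a capacity of 9. Nothing ties the two numbers together, which is why strlen can report more characters than the array holds after an overflow.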

Because of these problems, C functions that take an array argument usually take the array size as well; otherwise they have no way of telling how big it is. To avoid these problems, it's much better to use C++ objects like std::string.
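A minimal sketch of the std::string approach, assuming a hypothetical helper `read_name`: the string grows to fit whatever word is extracted, so there is no fixed buffer to overflow.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one whitespace-delimited word of any length.
std::string read_name(std::istream& in) {
    std::string name;
    in >> name;  // std::string allocates as much storage as the word needs
    return name;
}
```

In the original program this would be `read_name(std::cin)`, with `name.size()` taking the place of strlen(name1).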

Ari
  • 1
    array names aren't actually just pointers to the first array element, but an array is converted implicitly to a pointer to the first element in most contexts. – bames53 Jul 12 '12 at 17:21
  • Thanks for the full explanation, it will do the OP some good. – egrunin Jul 13 '12 at 03:48
3

No trick here :) You are writing to memory outside the buffer, which is undefined behaviour.

marcinj