0

After days of attempting to create a shell I ask for a bit of help. I have started over 4 or so times with different data structures and plea for a solution to the below problem. I have a string that I need to break into individual arguments and have a pointer to that. I ultimately pass args to an exec function but since i cant seem to fill the args correctly i am getting funny results, here is a simplified version of whats happening

char* args[100];
int counter=0;
string temp = "some text and stuff here";
stringstream s (temp);
while(s>> temp)
{
    cout << "TOKEN " << counter << " =" << temp <<endl;
    args[counter]=const_cast<char *> (temp.c_str());
    counter++;
}
//print the debug info
      for( int ii=0; args[ii] != NULL; ii++ )
    {
      cout << "Argument OUT " << ii << ": " << args[ii] << endl;
    }

This code doesn't work and I cant grasp why. The result stores "here" in every value of args but counter gets changed.

TOKEN 0 =some
TOKEN 1 =text
TOKEN 2 =and
TOKEN 3 =stuff
TOKEN 4 =here
Argument OUT 0: here
Argument OUT 1: here
Argument OUT 2: here
Argument OUT 3: here
Argument OUT 4: here
Yu Hao
  • 119,891
  • 44
  • 235
  • 294
  • c_str returns a const pointer for a reason! – Neil Kirk Oct 07 '14 at 16:34
  • 1
    @NeilKirk The const-ness issue is unrelated to the pointers all pointing to the same place. Note that he needs non-const pointers because this is what the `exec*()` family of functions requires, even though they do not modify the passed strings, so `const_cast` is safe here. It's the reusing of the same `std::string` object that is causing the problem. – cdhowie Oct 07 '14 at 16:38

3 Answers3

3

When you do this:

args[counter]=const_cast<char *> (temp.c_str());

You are not copying the string, only storing a pointer to its content. So of course they all point to the same temp string, which makes the values all the same when you print them.

This would be a lot easier if you just used std::vector<std::string> for args.

John Zwinck
  • 239,568
  • 38
  • 324
  • 436
1

Probably because the temp object is re-using its internal allocation. When you store the c_str() result you are only storing a memory address. The std::string class does not create a completely new allocation each time you read into it from your string stream, rather it reuses the allocation it already has (if possible).

Further, using a pointer returned by c_str() after you have done anything else to the std::string object from which it was obtained invokes undefined behavior.1

If possible, just change args to std::vector<std::string>. If this is not possible then you need to strdup() the pointer returned by c_str() in order to create an entirely new allocation that copies the value of the string at that moment. You must, of course, remember to free() the allocations when you are done.

Additionally, casting away the const qualifier and writing to the pointer results in undefined behavior.2 At a minimum you need to change args to be const char * args[100];, but I would strongly suggest using a vector of strings instead.


1 http://en.cppreference.com/w/cpp/string/basic_string/c_str

The pointer obtained from c_str() may be invalidated by:

  • Passing a non-const reference to the string to any standard library function, or
  • Calling non-const member functions on the string, excluding operator[], at(), front(), back(), begin(), rbegin(), end() and rend().

2 http://en.cppreference.com/w/cpp/string/basic_string/c_str

Writing to the character array accessed through c_str() is undefined behavior.


Based on your comment indicating that you need to use exec(), it sounds like you do need an array of pointers-to-char. However, we can still do this using vectors. You need one vector to hold the std::string objects, which will own the char* allocations. Then you can use another vector to hold the actual pointers. Something like this:

const char * binaryPath = "/bin/foo";

std::vector<std::string> argStrings;
std::vector<char *> argPointers;

std::string temp = "some text and stuff here";
istringstream s(temp);

// argv[0] should always contain the binary's path.
argPointers.push_back(binaryPath);

while (s >> temp) {
    argStrings.push_back(temp);

    std::cout << "TOKEN " << argStrings.size()
              << " =" << argStrings.back() << std::endl;

    // We cast away the const as required by the exec() family of functions.
    argPointers.push_back(const_cast<char *>(argStrings.back().c_str()));
}

// exec() functions expect a NULL pointer to terminate the arguments array.
argPointers.push_back(nullptr);

// Now we can do our exec safely.
execv(binaryPath, &argPointers[0]);

In this case argStrings owns the actual string allocations, and we are using argPointers only to hold the array of pointers we will be passing to execv(). The const_cast is safe because execv() does not modify the strings. (The argument is char * const [] for compatibility with older C code; the functions behave as though the argument is const char * const [].)

cdhowie
  • 158,093
  • 24
  • 286
  • 300
  • I see a vector is the way to go now, thanks. If I change to a vector I will still have to update args pointers to the vector so i can pass to an exec system call. Would I want the args to be a const char * args[100] still since I will have to convert. (maybe something like char **args)? – Chase Collis Parker Oct 07 '14 at 12:35
  • @ChaseCollisParker Hmm, this is a bit trickier then since you need to match an existing API. This will depend on the particular exec call you are using (there are many, with varying argument types). If you would like we could discuss further on SO chat. – cdhowie Oct 07 '14 at 15:16
  • Thank you very much for the in depth explanation. This experience has been invaluable. I had not quite figured out a vector of pointers would work the same when passed to exec. The transition back to pointers (from java) is a mega pain, I will make sure to document better this time. Thanks again sir. – Chase Collis Parker Oct 07 '14 at 20:58
  • Why am I stating the binary path? can the exec call find it for me since this will be ran on another distro for testing I am not sure I can know it. – Chase Collis Parker Oct 07 '14 at 21:46
  • @ChaseCollisParker Yes, you will want to use `execvp()` instead, which will take a "bare" first argument, and search the `PATH` environment variable for an executable file with that name (similar to what a shell would do). – cdhowie Oct 07 '14 at 21:59
0

You need to store each string separately, storing a pointer to a temporary object is not the way to go. E.g.

#include <iostream>
#include <vector>
#include <sstream>

using namespace std;

void exec(char* args[])
{
    for (int i = 0; args[i] != NULL; ++i)
      cout << args[i] << endl;
}

int main() 
{
  string temp = "some text and stuff here";
  stringstream s (temp);    

  vector<string> tokens;
  while(s>> temp)
  {
    tokens.push_back(temp);
  }

  int counter = 0;
  char *args[100];
  for (auto it = tokens.begin(); it != tokens.end(); ++it)
    args[counter++] = const_cast<char*>(it->c_str());
  args[counter] = NULL;

  exec(args);

  return 0;
}

you can run it here

AndersK
  • 35,813
  • 6
  • 60
  • 86