I mean to ask about the preservation of string literals within a call to execve family. From OSTEP Chp.4:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>

int main (int argc, char *argv[]) {
  ...
  int rc = fork();
  ...
  if (rc == 0) {
    char *myargs[3];
    myargs[0] = strdup("wc");
    myargs[1] = strdup("p3.c");
    myargs[2] = NULL;
    execvp(myargs[0], myargs);
  }
}

The ... marks code that is irrelevant to my question. As I understand it, the call sets up myargs to be presented as the char *argv[] of the main function of wc. My question is about string literals. From my understanding, the C standard allows a program to modify the strings that argv points to. For example:

int main (int argc, char *argv[]) {
  argv[0][2] = 'd';
}

However, I'm wondering how this can be guaranteed when these character arrays are fixed length. Additionally, why was strdup() called? From my understanding, this will heap-allocate the string, allowing it to be mutated, so is this necessary when calling execvp() or would:

  myargs[0] = "wc";

be acceptable?

user129393192
  • A call to `execvp` will wipe the heap, so it should make no difference. It’s likely guaranteed by some other method (though I cannot say) – user8393252 Apr 15 '23 at 23:50

1 Answer

From OSTEP Chp.4:

The chapter in that link is labeled 5, not 4.

However, I'm wondering how this can be guaranteed when these character arrays are fixed length.

The C standard says the program may modify the contents of the strings pointed to by argv. A string is a sequence of characters terminated by (and including) a null character. Therefore, the program may modify those characters. However, the program is only guaranteed to be able to use the bytes that are in each string when the program starts. It is not guaranteed that it can make the strings longer or use any bytes beyond the null character.
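A minimal sketch of what this guarantee permits. The helpers `mask_arg` and `mask_args` are hypothetical names introduced for illustration; they overwrite only characters that already belong to each string, so they never need more bytes than the string had at program startup.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: overwrite every character of one argument in place.
   Only the bytes before the terminating null character are written, so the
   string keeps exactly the length it had at program startup. */
static void mask_arg(char *arg) {
    assert(arg != NULL);
    size_t len = strlen(arg);          /* length fixed by the original string */
    for (size_t i = 0; i < len; i++) {
        arg[i] = 'x';                  /* stays inside the original string */
    }
}

/* Hypothetical helper: mask every argument after the program name. */
void mask_args(int argc, char *argv[]) {
    for (int i = 1; i < argc; i++) {
        mask_arg(argv[i]);
    }
}
```

Calling `mask_args(argc, argv)` from `main` stays within what the standard allows. Appending characters to `argv[1]`, or writing past its null character, is outside that guarantee.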

Additionally, why was strdup() called?

Because the author did not know or neglected the fact that it is not necessary.

is this necessary when calling execvp()…?

No. execvp takes steps to ensure the strings passed to it are made available to the new program. The program calling execvp does not need to make copies.

I mean to ask about the preservation of string literals within a call to execve family.

A string literal is a piece of source code. It is not a string in the resulting program. A string literal is a sequence of characters inside quote marks, optionally with a prefix u8, u, U, or L, as in "abc" or u8"def". The string literal is literally those characters in the source code, "abc", including the quote marks. It is not the characters “abc” in the program that is created. When a program is compiled, the compiler creates the string that is represented by the string literal (in the computing model used by the C standard). The resulting string is merely a string; there is nothing particularly special about it compared to a string that did not come from a string literal, except the C standard does not guarantee it can be modified. Some people call the resulting strings string literals, but this is a slightly sloppy use of language.
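The distinction can be sketched in code. The function names below are hypothetical, chosen only for illustration: the literal `"abc"` in the source yields a four-byte array in the program (three characters plus the null character), and an array initialized from a literal is an ordinary modifiable array.

```c
#include <assert.h>
#include <string.h>

/* The source text "abc" yields an array of 4 chars: 'a', 'b', 'c', '\0'. */
size_t literal_size(void) {
    return sizeof "abc";
}

/* Hypothetical helper: builds a modifiable array from a literal, changes it,
   and copies the result into dest, which must have room for 4 chars. */
void modifiable_copy(char *dest) {
    assert(dest != NULL);
    char copy[] = "abc";   /* a separate array initialized from the literal */
    copy[0] = 'X';         /* allowed: copy is an ordinary char array */
    memcpy(dest, copy, sizeof copy);
    /* By contrast, writing through a pointer to the array created for the
       literal itself (char *p = "abc"; p[0] = 'X';) is not guaranteed to
       work by the C standard. */
}
```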

Whether a string passed to execvp came from a string literal or was constructed by the program in some way or was read from input or has automatic, static, or dynamically allocated storage duration is irrelevant to execvp.
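As a hedged illustration of this point, the sketch below passes execvp one string that comes from a string literal, one with automatic storage duration, and one that is dynamically allocated. The function `run_shell_exit` is a hypothetical helper, and the sketch assumes a POSIX system where `sh` can be found on the PATH.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `sh -c "exit <code>"` in a child process and
   returns the child's exit status, or -1 on failure. */
int run_shell_exit(int code) {
    assert(code >= 0 && code <= 255);

    char flag[] = "-c";                    /* automatic storage duration */
    char *script = malloc(32);             /* dynamically allocated */
    if (script == NULL) {
        return -1;
    }
    snprintf(script, 32, "exit %d", code);

    /* "sh" is the array created for a string literal; no strdup is used. */
    char *args[] = { "sh", flag, script, NULL };

    pid_t pid = fork();
    if (pid < 0) {
        free(script);
        return -1;
    }
    if (pid == 0) {
        execvp(args[0], args);             /* replaces the child's program */
        _exit(127);                        /* reached only if execvp failed */
    }

    free(script);                          /* the parent no longer needs it */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The new program receives all three arguments the same way, regardless of where each string lived in the calling program.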

Eric Postpischil
  • I had thought that a string literal is an address location? And that the contents are read-only? Which is why people specify it that way, to differentiate from a string that is a `char []` and modifiable. – user8393252 Apr 16 '23 at 01:58
  • 2
    @user8393252: Re “I had thought that a string literal is an address location?”: No. A string literal is a piece of source code, as this answer explains. – Eric Postpischil Apr 16 '23 at 02:18
  • @user8393252: Re “And that the contents are read-only?”: This is entirely irrelevant because `execvp` does not pass the same addresses it receives to the new program (except by mere coincidence). `execvp` creates an entirely new program environment, replacing the old program. It does whatever work it needs for that, including copying strings. – Eric Postpischil Apr 16 '23 at 02:20
  • I am a bit confused by your response. Does something like `variable == "hello"` not treat `"hello"` as its address? And is `char *p = "hello"; p[1] = 'o';` not undefined behavior? I was not asking in the context of this question, but in general regarding what you said about string literals. – user8393252 Apr 16 '23 at 02:26
  • Re the above, it wouldn’t really make sense to set a `char *` to `"hello"` if the latter were not an address, or am I misunderstanding something? – user8393252 Apr 16 '23 at 02:39