How does execvp run a command?

Question

I know execvp can be used to execute simple commands as follows:

char* arg[] = {"ls", "-l", NULL};
execvp(arg[0],arg);

I want to know what goes on here when I call execvp. The man page says execvp replaces the current process image with a new one. However, here I am running a command, not an executable.

To be specific, say there is a command that requires input, e.g. cat. If I have a text file text.txt which contains the file name expected by cat, and I redirect stdin to that file, would execle("cat","cat",NULL) or execvp("cat", arg) (where arg stores "cat" and NULL) produce the same console output as cat /filename would? My intuition is that I have to read the file and maybe parse it to store the arguments in arg. However, I want to make sure.

Thanks in advance!

Type `which ls` into your terminal to see the location for the `ls` program. You are, indeed, running an executable when you say `ls`. — Cornstalks, Jan 13 '13 at 07:51
@Alex: With the exception of a few special commands which are implemented in the shell, like `cd`, `source`, and `exit`. — , Jan 13 '13 at 08:06

Michael Foukarakis · Accepted Answer · 2015-09-10T08:16:27.660

Here's what happens in an execvp call:

Your libc implementation searches in PATH, if applicable, for the file that is to be executed. Most, if not all, commands in UNIX-like systems are executables. What will happen if it is not? Try it. Have a look at how glibc does it.
Typically, if the executable is found, a call to execve will be made. Parts of execve may be implemented in libc or it may be a system call (like in Linux).
Linux prepares the program by allocating memory for it, opening it, scheduling it for execution, initialising memory structures, setting up its arguments and environment from those supplied to the execvp call, finding a handler appropriate for loading the binary, and marking the current task (the execvp caller) as not executing. You can find its implementation here.

All steps above conform to the requirements set by POSIX which are described in the relevant manual pages.

Awesome explanation! A minor doubt: does it mean the new image loaded will have the same CR3 value as the old one? — Sandhya Kumar, Sep 10 '15 at 08:11
No. Each process has its own CR3, which is the responsibility of the code that sets up its page tables. — Michael Foukarakis, Sep 11 '15 at 07:20

SaveTheRbtz · Answer 2 · 2014-07-23T06:52:27.227

Regarding your questions:

The man page says execvp replaces the current process image with a new one. However, here I am running a command, not an executable.

A long time ago the shell was very limited and almost all UNIX commands were standalone executables. Now, mostly for speed, a subset of UNIX commands is implemented inside the shell itself; those commands are called builtins. You can check whether a command is implemented in your shell as a builtin via the type command:

λ ~/ type echo
echo is a shell builtin

(A full list of builtins with descriptions can be found in the man pages for your shell, e.g. man bash-builtins or man builtin.)

But most of these commands still have an executable counterpart:

λ ~/ whereis echo
/bin/echo

So in your specific case when you are running:

char* arg[] = {"ls", "-l", NULL};
execvp(arg[0],arg);

You are actually replacing the address space of the current process with the address space of (most likely) /bin/ls.
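To see the replacement in action without losing your own program, the usual pattern is to fork first and call execvp in the child. The following is only a minimal sketch under that assumption; run_command is a hypothetical helper name, not something from the question, and 127 is just the conventional "command not found" exit code.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] in a child process and
   return its exit status (or -1 on error / abnormal exit). */
int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: on success execvp never returns, because this
           process image is replaced by the program for argv[0]. */
        execvp(argv[0], argv);
        /* Reached only if execvp failed (e.g. command not found). */
        perror("execvp");
        _exit(127);
    }
    /* Parent: its own image is untouched; wait for the child. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With char *arg[] = {"ls", "-l", NULL}; a call to run_command(arg) would list the directory from the child while the calling program keeps running.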

My intuition is that I have to read the file and maybe parse it to store the arguments in arg.

Indeed you do. But you can also use an in-kernel mechanism for that, the so-called "shebang":
Instead of putting the file name in a separate file, add a shebang as the first line of the file you want to cat:

#!/bin/cat

And make it executable with chmod +x. Then you can run it as an executable (via any of the exec functions or the shell):

λ ~/tmp/ printf '#!/bin/cat\nTEST\n' > cat_me
λ ~/tmp/ chmod +x cat_me
λ ~/tmp/ ./cat_me 
#!/bin/cat
TEST

Of course it has the drawback of printing the shebang line itself along with the file, but it's still fun to do it in-kernel =)

BTW, the problem you described is so common that there is a special executable called xargs which (in a very simplified explanation) executes a given program on a list of arguments passed via stdin. For more information consult man xargs.

To memorize the exec family easily I often use the following table:

           Figure 8.14. Differences among the six exec functions
+----------+----------+----------+----------+--------+---------+--------+
| Function | pathname | filename | arg list | argv[] | environ | envp[] |
+----------+----------+----------+----------+--------+---------+--------+
|  execl   |    *     |          |     *    |        |    *    |        |
+----------+----------+----------+----------+--------+---------+--------+
|  execlp  |          |    *     |     *    |        |    *    |        |
+----------+----------+----------+----------+--------+---------+--------+
|  execle  |    *     |          |     *    |        |         |   *    |
+----------+----------+----------+----------+--------+---------+--------+
|  execv   |    *     |          |          |    *   |    *    |        |
+----------+----------+----------+----------+--------+---------+--------+
|  execvp  |          |    *     |          |    *   |    *    |        |
+----------+----------+----------+----------+--------+---------+--------+
|  execve  |    *     |          |          |    *   |         |   *    |
+----------+----------+----------+----------+--------+---------+--------+
|  letter  |          |    p     |     l    |    v   |         |   e    |
+----------+----------+----------+----------+--------+---------+--------+

So in your case execvp takes a filename (p) and argv[] (v), and passes the current environ along unchanged (there is no e in its name). It then tries to "guess" the pathname (aka full path) by appending the filename (in your case cat) to each path component in PATH until it finds a path with an executable filename.
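The PATH "guessing" can be approximated in a few lines of C. This is only a sketch: find_in_path is a hypothetical helper, the "/bin:/usr/bin" fallback is an assumption (the real default is implementation-defined), and a real libc typically attempts execve on each candidate instead of probing with access().

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper mimicking the PATH lookup. Writes the first
   executable match into `out` and returns 0, or returns -1 if
   nothing suitable was found. */
int find_in_path(const char *file, char *out, size_t outsz)
{
    /* A name containing a slash is used as a pathname directly;
       PATH is not searched in that case. */
    if (strchr(file, '/') != NULL) {
        if ((size_t)snprintf(out, outsz, "%s", file) >= outsz)
            return -1;
        return access(out, X_OK) == 0 ? 0 : -1;
    }

    const char *path = getenv("PATH");
    if (path == NULL)
        path = "/bin:/usr/bin";   /* assumed fallback for this sketch */

    for (;;) {
        size_t len = strcspn(path, ":");   /* length of this component */
        /* An empty component stands for the current directory. */
        int n = len ? snprintf(out, outsz, "%.*s/%s", (int)len, path, file)
                    : snprintf(out, outsz, "./%s", file);
        if (n > 0 && (size_t)n < outsz && access(out, X_OK) == 0)
            return 0;
        if (path[len] == '\0')
            break;                 /* that was the last component */
        path += len + 1;           /* skip past the ':' */
    }
    return -1;
}
```

Once a pathname has been found this way, calling execv(out, argv) would be roughly what execvp(file, argv) does in one step.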

Much more information about what's going on under exec's hood (including inheritance stuff) can be found in Advanced Programming in the UNIX Environment (2nd Edition) by W. Richard Stevens and Stephen A. Rago, aka APUE2.
If you are interested in UNIX internals you should probably read it.

score 2 · Answer 3 · answered Jan 13 '13 at 07:21

"ls" isn't just a command, it's actually a program (most commands are). When you run execvp like that, it will nuke your entire program, its memory, its stack, its heap, etc... conceptually "clear it out" and give it to "ls" so it can use it for its own stack, heap, etc.

In short, execvp will destroy your program, and replace it with another program, in this case "ls".

Verdagon


Not sure... looks like someone just upvoted it back to 0 though lol. – Verdagon Jan 13 '13 at 07:23

score 1 · Answer 4 · answered Jan 13 '13 at 08:05

My intuition is that I have to read the file and maybe parse it to store the arguments in arg. However, I want to make sure.

Your intuition is largely correct. The cat utility that you're using as an example has two separate code paths:

If there are filenames specified as arguments, it will open and read each one in turn.
If there are no filenames specified, it will read from standard input.

This behavior is specifically implemented in the cat utility -- it is not implemented at any lower level. In particular, it is definitely not part of the exec system call. The exec system calls do not "look at" arguments at all; they just pass them straight on to the new process in argv, and that process gets to handle them however it sees fit.
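The second code path is easy to observe with a small experiment. The sketch below (cat_with_stdin_from is a hypothetical helper name, and the exit codes 126 and 127 are just conventions chosen here) redirects the child's stdin to a file and runs cat with no file arguments. What appears on the console is the contents of that file, i.e. the file name stored in it, not the contents of the file it names.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `cat` with no file arguments and with
   its stdin redirected from `path`. Returns cat's exit status. */
int cat_with_stdin_from(const char *path)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        int fd = open(path, O_RDONLY);
        if (fd < 0) {
            perror("open");
            _exit(126);
        }
        if (fd != STDIN_FILENO) {
            /* Make descriptor 0 refer to the file, then drop the spare. */
            if (dup2(fd, STDIN_FILENO) < 0) {
                perror("dup2");
                _exit(126);
            }
            close(fd);
        }
        /* argv is just {"cat", NULL}: exec passes it on untouched, and
           cat itself decides to copy stdin because no names were given. */
        execlp("cat", "cat", (char *)NULL);
        perror("execlp");
        _exit(127);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling cat_with_stdin_from("text.txt") would therefore print the file name stored in text.txt. To print the named file instead, the name has to end up in argv.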
