Regarding your questions:
"In man page it says execvp replaces the image of the process image with the new one. However here I am running a command not an executable."
A long time ago the shell was very limited and almost all UNIX commands were standalone executables. Nowadays, mostly for speed, a subset of UNIX commands is implemented inside the shell itself; those commands are called builtins. You can check whether a command is implemented in your shell as a builtin with the type command:
λ ~/ type echo
echo is a shell builtin
(A full list of builtins with descriptions can be found in the man pages for your shell, e.g. man bash-builtins or man builtin.)
But most of these commands still have an executable counterpart:
λ ~/ whereis echo
/bin/echo
So in your specific case when you are running:
char* arg[] = {"ls", "-l", NULL};
execvp(arg[0],arg);
You are actually replacing the address space of the current process with the address space of (most likely) /bin/ls.
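Because the process image is replaced, nothing after a successful execvp call runs in that process. If you want your own program to keep going, the usual pattern is to fork first and exec in the child. Here is a minimal sketch of that; run_command is a hypothetical helper name of mine, and 127 as the "could not exec" status is just the shell convention, not something execvp mandates:

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] (looked up in PATH) in a child process.
 * Returns the child's exit status, or -1 on error or abnormal termination. */
int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: on success execvp never returns, because this process
         * image is replaced by the new program. */
        execvp(argv[0], argv);
        /* Reached only if execvp failed (e.g. command not found). */
        perror("execvp");
        _exit(127);
    }

    /* Parent: wait for the child and report how it exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With the array from your snippet you would call it as run_command(arg), and the listing from ls appears on your program's stdout.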
"My intuition is I have to read the file and may be parse it to store the arguments in the arg."
Indeed you do. But you can also use an in-kernel mechanism for that, the so-called "shebang":
Instead of putting the file name in a separate file, add a shebang as the first line of the file you want to cat:
#!/bin/cat
Then make the file executable with chmod +x. After that you can run it as an executable (via any of the exec functions or from the shell):
λ ~/tmp/ printf '#!/bin/cat\nTEST\n' > cat_me
λ ~/tmp/ chmod +x cat_me
λ ~/tmp/ ./cat_me
#!/bin/cat
TEST
Of course, it has the drawback of printing the shebang line itself along with the file, but it's still fun to do it in-kernel =)
BTW, the problem you described is so common that there is a special executable called xargs which (in a very simplified explanation) executes a given program on a list of arguments passed via stdin. For more information, consult man xargs.
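If you would rather do the "read the file and parse it" part yourself, here is a rough sketch of the idea. build_argv is a hypothetical helper of mine, not a library function. It only splits on whitespace, so unlike the real xargs it knows nothing about quoting, and words longer than 4095 bytes would be split:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copy a string into freshly allocated memory (NULL on failure). */
static char *copy_word(const char *s)
{
    size_t len = strlen(s) + 1;
    char *p = malloc(len);
    if (p != NULL)
        memcpy(p, s, len);
    return p;
}

/* Hypothetical helper: build a NULL-terminated argv consisting of `cmd`
 * followed by every whitespace-separated word read from `in`.
 * Error handling is deliberately minimal: on allocation failure the
 * list is simply cut short. The caller owns the returned memory. */
char **build_argv(const char *cmd, FILE *in)
{
    size_t cap = 8, n = 0;
    char word[4096];
    char **argv = malloc(cap * sizeof *argv);

    if (argv == NULL)
        return NULL;
    argv[n++] = copy_word(cmd);

    /* %4095s skips leading whitespace and reads one word at a time. */
    while (fscanf(in, "%4095s", word) == 1) {
        if (n + 2 > cap) {              /* keep room for the final NULL */
            char **tmp = realloc(argv, 2 * cap * sizeof *argv);
            if (tmp == NULL)
                break;
            argv = tmp;
            cap *= 2;
        }
        argv[n++] = copy_word(word);
    }
    argv[n] = NULL;                     /* exec functions require this */
    return argv;
}
```

You would then open your file with fopen, pass the stream to build_argv("cat", f), and hand the result to execvp(args[0], args).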
To memorize the exec family more easily, I often use the following table:
Figure 8.14. Differences among the six exec functions
+----------+----------+----------+----------+--------+---------+--------+
| Function | pathname | filename | arg list | argv[] | environ | envp[] |
+----------+----------+----------+----------+--------+---------+--------+
| execl    |    *     |          |    *     |        |    *    |        |
+----------+----------+----------+----------+--------+---------+--------+
| execlp   |          |    *     |    *     |        |    *    |        |
+----------+----------+----------+----------+--------+---------+--------+
| execle   |    *     |          |    *     |        |         |   *    |
+----------+----------+----------+----------+--------+---------+--------+
| execv    |    *     |          |          |   *    |    *    |        |
+----------+----------+----------+----------+--------+---------+--------+
| execvp   |          |    *     |          |   *    |    *    |        |
+----------+----------+----------+----------+--------+---------+--------+
| execve   |    *     |          |          |   *    |         |   *    |
+----------+----------+----------+----------+--------+---------+--------+
| letter   |          |    p     |    l     |   v    |         |   e    |
+----------+----------+----------+----------+--------+---------+--------+
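To make the table concrete, here is a small sketch that launches the same command, sh -c "exit 3", through each of the six functions. run_with is a hypothetical demo function of mine, and I am assuming the shell lives at /bin/sh. Note how the l forms list the arguments inline (ending with a NULL pointer), the v forms take an array, the p forms take a bare filename, and the e forms take an explicit environment:

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: run `sh -c "exit 3"` via the named exec variant in a
 * child process. Returns the child's exit status, or -1 on error. */
int run_with(const char *variant)
{
    char *argv[] = {"sh", "-c", "exit 3", NULL};  /* used by the v forms */
    char *envp[] = {"PATH=/bin:/usr/bin", NULL};  /* used by the e forms */
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (strcmp(variant, "execl") == 0)
            execl("/bin/sh", "sh", "-c", "exit 3", (char *)NULL);
        else if (strcmp(variant, "execlp") == 0)
            execlp("sh", "sh", "-c", "exit 3", (char *)NULL);
        else if (strcmp(variant, "execle") == 0)
            execle("/bin/sh", "sh", "-c", "exit 3", (char *)NULL, envp);
        else if (strcmp(variant, "execv") == 0)
            execv("/bin/sh", argv);
        else if (strcmp(variant, "execvp") == 0)
            execvp("sh", argv);
        else if (strcmp(variant, "execve") == 0)
            execve("/bin/sh", argv, envp);
        /* Reached only on an unknown variant name or a failed exec. */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

All six calls end up running the same program; they differ only in how you hand over the path, the arguments and the environment.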
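So in your case execvp takes a filename (p) and argv[] (v), and passes along the current environ, since there is no e in its name.
It then tries to "guess" the pathname (aka full path) by appending filename (in your case cat) to each path component in PATH until it finds a path with an executable filename. (If filename contains a slash, it is used as a pathname directly and PATH is not searched.)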
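The lookup can be sketched roughly like this. find_in_path is a hypothetical helper of mine that only approximates what a real execvp does: it uses access with X_OK as the "is executable" test, does not check that the candidate is a regular file, and ignores details such as the default search path used when PATH is unset:

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper mimicking the PATH lookup: writes the first
 * executable candidate "<dir>/<file>" from the colon-separated list
 * `path` into `out`. Returns 0 on success, -1 if nothing was found. */
int find_in_path(const char *file, const char *path, char *out, size_t outlen)
{
    /* A name containing a slash is used as is; PATH is not searched. */
    if (strchr(file, '/') != NULL) {
        int n = snprintf(out, outlen, "%s", file);
        return (n >= 0 && (size_t)n < outlen && access(out, X_OK) == 0) ? 0 : -1;
    }

    const char *p = path;
    for (;;) {
        const char *end = strchr(p, ':');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        /* An empty component traditionally means the current directory. */
        int n = (len == 0)
            ? snprintf(out, outlen, "./%s", file)
            : snprintf(out, outlen, "%.*s/%s", (int)len, p, file);
        if (n >= 0 && (size_t)n < outlen && access(out, X_OK) == 0)
            return 0;               /* first match wins */
        if (end == NULL)
            return -1;              /* ran out of components */
        p = end + 1;
    }
}
```

For your case you would call it with "cat" and the value of getenv("PATH") (after checking that it is not NULL) and would most likely get /bin/cat or /usr/bin/cat back.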
Much more information about what's going on under the hood of exec (including what the new program inherits) can be found in Advanced Programming in the UNIX Environment (2nd Edition) by W. Richard Stevens and Stephen A. Rago, aka APUE2.
If you are interested in UNIX internals, you should probably read it.