Determine which binary will run via execlp in advance

Question

Edit #1

The "Possible duplicates" so far are not duplicates. They test for the existence of $FILE in $PATH, rather than providing the full path to the first valid result; and the top answer uses bash commands, not pure C.

Original Question

Of all the exec family functions, there are a few which do $PATH lookups rather than requiring an absolute path to the binary to execute.

From man exec:

The execlp(), execvp(), and execvpe() functions duplicate the actions of the shell in searching for an executable file if the specified filename does not contain a slash (/) character. The file is sought in the colon-separated list of directory pathnames specified in the PATH environment variable. If this variable isn't defined, the path list defaults to the current directory followed by the list of directories returned by confstr(_CS_PATH). (This confstr(3) call typically returns the value "/bin:/usr/bin".)

Is there a simple, straightforward way, to test what the first "full path to execute" will evaluate to, without having to manually iterate through all the elements in the $PATH environment variable, and appending the binary name to the end of the path? I would like to use a "de facto standard" approach to estimating the binary to be run, rather than re-writing a task that has likely already been implemented several times over in the past.

I realize that this won't be a guarantee, since someone could potentially invalidate this check via a buggy script, TOCTOU attacks, etc. I just need a decent approximation for testing purposes.

Thank you.

Possible duplicate of [C, runtime test if executable exists in PATH](http://stackoverflow.com/questions/8035372/c-runtime-test-if-executable-exists-in-path) — kaylum, Jan 27 '17 at 05:51
Why do you ask? What is the actual use case? Please **edit your question** to motivate it. — Basile Starynkevitch, Jan 27 '17 at 06:27
@BasileStarynkevitch I wish to know what program will (likely) be executed in advance for testing purposes. — Cloud, Jan 27 '17 at 06:44
The tests a security-sensitive application should (in my paranoid opinion) do, is first locate the actual binary to be executed (and later use the exact full path), and check the ownership of the file, directory, and all parent directories, twice. The idea is to verify that no untrusted user has write access to any of them; the second pass is to verify that nothing changed from the first pass (bait-and-switch). UID 0, GID 0 ("root") is always trusted, any other trusted accounts depend on the application. — Nominal Animal, Jan 27 '17 at 09:38

score 3 · Accepted Answer · edited May 23 '17 at 12:33

Is there a simple, straightforward way, to test what the first "full path to execute" will evaluate to, without having to manually iterate through all the elements in the $PATH environment variable

No, you need to iterate through $PATH (i.e. getenv("PATH") in C code). Some (non-standard) libraries provide a way to do that, but it is so simple that you should not bother. You could use strchr(3) to find the "next" occurrence of the colon :, so coding that loop is straightforward. As Jonathan Leffler commented, there are subtleties (e.g. permissions, dangling symbolic links, some other process adding a new executable to a directory mentioned in your $PATH), but most programs ignore them.
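A minimal sketch of such a loop, to show how little code is involved. The helper names `find_in_path` and `is_executable_file` are made up for this example, and the path list is passed in as a parameter (you would normally pass getenv("PATH")). It only approximates what execvp does: access(2) checks against the real UID, and the file can still change between the check and the exec.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: true if p names a regular file we may execute. */
static int is_executable_file(const char *p)
{
    struct stat st;
    return stat(p, &st) == 0 && S_ISREG(st.st_mode) && access(p, X_OK) == 0;
}

/*
 * Hypothetical helper: store in out the first candidate that a PATH-style
 * search of path_list would find for name.
 * Returns 0 on success, -1 if nothing suitable was found.
 */
int find_in_path(const char *name, const char *path_list,
                 char *out, size_t outsz)
{
    if (name == NULL || *name == '\0' || path_list == NULL || outsz == 0)
        return -1;

    /* A name containing a slash is used as-is, without any search. */
    if (strchr(name, '/') != NULL) {
        int n = snprintf(out, outsz, "%s", name);
        if (n > 0 && (size_t)n < outsz && is_executable_file(out))
            return 0;
        return -1;
    }

    const char *p = path_list;
    for (;;) {
        const char *colon = strchr(p, ':');
        size_t len = colon ? (size_t)(colon - p) : strlen(p);
        int n;

        if (len == 0)   /* an empty entry means the current directory */
            n = snprintf(out, outsz, "./%s", name);
        else
            n = snprintf(out, outsz, "%.*s/%s", (int)len, p, name);

        /* Skip candidates that were truncated or are not executable. */
        if (n > 0 && (size_t)n < outsz && is_executable_file(out))
            return 0;

        if (colon == NULL)
            break;
        p = colon + 1;
    }
    return -1;
}
```

Calling `find_in_path("ls", getenv("PATH"), buf, sizeof buf)` with a suitably sized `buf` (and `<stdlib.h>` included for getenv) then leaves the first match in `buf`, if there is one.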

And what is really relevant is the PATH value just before running execvp. In practice, that is the value of PATH when your program started (because outside processes cannot change it). You just need to be sure that your program doesn't change PATH, which is very likely the case (the difficult corner case would be some other thread of the same process changing the PATH environment variable with putenv(3) or setenv(3)).

In practice the PATH won't change (unless you have some code explicitly changing it). Even if you use proprietary libraries and don't have time to check their source code, you can expect PATH to stay the same in practice during execution of your process.

If you need something more precise, and assuming you use the execlp/execvp functions on program names which are compile-time constants (or at least constant once your program has finished initializing and reading its configuration files), you could do what many shells do: "cache" the result of searching the PATH in some hash table, and use execve on that. Still, you cannot avoid the issue of some other process adding or removing files in directories mentioned in your PATH; but most programs don't care (and are written with the implicit hypothesis that this doesn't happen, or that your program is notified when it does: look at the rehash builtin of zsh as an example).
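A rough sketch of the caching idea, under simplifying assumptions: a small fixed-size linear array stands in for the hash table, nothing is ever evicted or rehashed, and the names `cached_lookup`, `CACHE_SLOTS` and `struct cache_entry` are invented for this example.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#define CACHE_SLOTS 16          /* arbitrary size chosen for this sketch */

struct cache_entry {
    char name[64];              /* program name as given by the caller */
    char path[4096];            /* full path found on the first lookup */
};

static struct cache_entry cache[CACHE_SLOTS];
static size_t cache_used;

/*
 * Hypothetical helper: return the remembered full path for name, searching
 * path_list only on a cache miss. Returns NULL if nothing was found or the
 * cache is full.
 */
const char *cached_lookup(const char *name, const char *path_list)
{
    for (size_t i = 0; i < cache_used; i++)
        if (strcmp(cache[i].name, name) == 0)
            return cache[i].path;               /* hit: no search at all */

    if (cache_used == CACHE_SLOTS || strlen(name) >= sizeof cache[0].name)
        return NULL;

    struct cache_entry *e = &cache[cache_used];
    for (const char *p = path_list; p != NULL; ) {
        const char *colon = strchr(p, ':');
        size_t len = colon ? (size_t)(colon - p) : strlen(p);
        struct stat st;

        /* An empty entry yields a bare name, i.e. the current directory. */
        int n = snprintf(e->path, sizeof e->path, "%.*s%s%s",
                         (int)len, p, len ? "/" : "", name);
        if (n > 0 && (size_t)n < sizeof e->path &&
            stat(e->path, &st) == 0 && S_ISREG(st.st_mode) &&
            access(e->path, X_OK) == 0) {
            strcpy(e->name, name);              /* length was checked above */
            cache_used++;
            return e->path;
        }
        p = colon ? colon + 1 : NULL;
    }
    return NULL;
}
```

The returned path can then be handed to execv or execve instead of calling execvp again. A stale entry (the file was removed or replaced after the first lookup) simply shows up as an exec failure, or as a different binary running, which is the limitation described above.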

But you always need to check for failure of the exec functions (including execlp(3) and execve(2)) and of fork. They can fail for many reasons, even if the PATH has not changed and the directories and files mentioned in it have not been changed.
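For illustration, one common shape of that error checking. The function name `run_and_wait` is made up here, and exiting the child with status 127 when the exec fails is a shell convention rather than something the exec functions require.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] (looked up via PATH) in a child process.
 * Returns the child's exit status, or -1 if fork or waitpid failed or the
 * child did not exit normally.
 */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {                      /* fork itself can fail */
        perror("fork");
        return -1;
    }

    if (pid == 0) {
        execvp(argv[0], argv);
        /* execvp only returns on failure; errno says why. */
        fprintf(stderr, "execvp %s: %s\n", argv[0], strerror(errno));
        _exit(127);
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {           /* retry only when interrupted */
            perror("waitpid");
            return -1;
        }
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this convention the parent cannot tell a failed exec apart from a program that itself exits with 127; if that distinction matters, the child has to report errno back some other way (for example through a pipe).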

There are tricky bits to parsing a PATH value. Formally, if the first character is a colon, or the last character is a colon, or there are two adjacent colons in the PATH, then the directory name is (implicitly) '`.`' (dot, the current directory). There are other complicating factors too: checking the permissions on the file, ACLs (which might hide that a user or group has permission when in fact they do), maybe checking permissions on the directory, though if you can check permissions on the file, you're usually OK. — Jonathan Leffler, Jan 27 '17 at 06:21
@JonathanLeffler: in principle you are perfectly right. In practice, one often doesn't care. — Basile Starynkevitch, Jan 27 '17 at 06:24
Thank you. This is far more informative than the "possible duplicate" linked to this question. — Cloud, Jan 27 '17 at 06:27
