3

I'm writing a toy bash shell. My goal right now is to cycle through the environment looking for the path a specific command can be found at. Right now I am delimiting the PATH (e.g. "/home/user/bin:home/user/.local/bin:/usr/local/sbin" etc) by ":", and for each path that gives me, copying that path to the new string finalPath, and then concatenating "/cmd" to the end.

My problem is that when I try to copy the contents of path into finalPath, any changes I make to finalPath are then reflected onto path. As the code is right now, path will only be set to "home/user/bin" once, cycle through and be set to the same thing again, then the tokenizer hits "NULL" and terminates the while loop.

This suggests that path and finalPath are sharing a memory address, but since strcpy theoretically makes a new copy in memory, I must be doing something wrong with my strings and pointers.

Any idea what's causing this unexpected behavior?

Edit: this code executes just as expected when I comment out strcpy

A stripped-down version of my code is below:

int findpath(char* cmd, command_t* p_cmd) {
    char* path_var;

    path_var = getenv( "PATH" );

    char* path;
    char tempEnv[sizeof(path_var)];
    strcpy(tempEnv, path_var);
    path = strtok(tempEnv, ":");

    while(path != NULL) {
        char fullPath[1000];
        strcpy(finalPath, path);
        printf("path: %s\n", path);
        printf("finalPath: %s\n", finalPath);
        path = strtok(NULL, ":");
    }
teleTele
  • 33
  • 3
  • 3
    Yes, `strtok` does indeed change the source string. That's documented bahavior. – alain Sep 25 '16 at 19:31
  • 3
    `sizeof(path_var)` is not length of `path_var`. It's pointer size. – BLUEPIXY Sep 25 '16 at 19:32
  • Strtok changing the source string is fine, it's strcpy I'm confused about. Edit: some clarification is that this code cycles through just fine when I comment out the strcpy – teleTele Sep 25 '16 at 19:34
  • `finalPath` is not declared anywhere in your code – M.M Sep 25 '16 at 21:41

1 Answers1

3

BLUEPIXY is right: the tempEnv isn't big enough for your string. Try:

char *tempEnv;
tempEnv = malloc(strlen(path_var)+1);
strcpy(tempEnv, path_var);

and at the end

free(tempEnv);

With the proviso that this is full of holes. You should be using safer string functions, e.g., as described here. For example, use strnlen to enforce set some reasonable limit on the length of path_var. Make sure path_var is NULL-terminated within that limit. Use strncpy instead of strcpy. Add the NULL after the strncpy if necessary. And a host of other rules, which I am not including here since your goal appears to be learning rather than production code. Happy hacking!

cxw
  • 16,685
  • 2
  • 45
  • 81
  • Under what circumstances do you think `path_var` might not be null terminated? How could the program tell? I don't find that caveat remotely compelling — the output from `getenv()` is a null-terminated string. Using `strncpy()` is a mixed bag — it doesn't guarantee null termination when the source is longer than the target, and it guarantees null padding to full length when the source is shorter than the target. 'Tis a curious function; 'tis not a universal answer for safety concerns (though it can be used safely if you're careful). – Jonathan Leffler Sep 25 '16 at 20:10
  • 1
    `char tempEnv[strlen(path_var)+1];` is also fine – M.M Sep 25 '16 at 21:42
  • @JonathanLeffler Edited to clarify. What I was trying to get at was that `while(*c++) i++;` is a bad idea when `*c` is not under the program's control. In any event, I agree this answer is incomplete with respect to safe string practices --- I leave that to folks who have more experience than I do. :) – cxw Sep 26 '16 at 14:35
  • As shorter version: `tempEnv = strdup(path_var);`. There's also `strndup()`. – kfx Sep 26 '16 at 14:35
  • Definitely better. There is, in practice, a relatively small upper-limit on the length of the entire environment (and the command line arguments), sometimes as high as 512 KiB but normally smaller than that. Which means that there isn't all that much danger from an overlong PATH as a part of the environment. However, the advice might well apply to other, less constrained sources of long strings. It's one problem with the POSIX [`getline()`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/getline.html), for example — you can't specify an upper-bound on the length of a line. – Jonathan Leffler Sep 26 '16 at 14:42
  • This answer recommends `strncpy()`, and links to an article which recommends *against* `strncpy()`. There is rarely a good reason to use `strncpy()`, it's not a "safe" version of `strcpy()` it's just a different function. The recommendation to use `strnlen()` is somewhat silly too. These are environment variables, not pointers to random memory buffers. The operating system guarantees that they are NUL-terminated and are not too long (Linux seems to limit them to 128 KiB each) – Dietrich Epp Sep 26 '16 at 15:34
  • @DietrichEpp All valid points. I am definitely not an expert on safe string processing --- I only know that it's a topic that should not be overlooked. I also know that treating OS-provided guarantees with a certain amount of paranoia can be useful :) . (e.g., the recent [SFV brouhaha](http://www.theregister.co.uk/2016/09/23/capcom_street_fighter_v/).) Please send me some better links and I will cheerfully edit the answer to incorporate them! – cxw Sep 26 '16 at 17:14
  • If your program is sensitive, for example, if it is expected to run setuid, then environment variables are definitely a concern—especially `PATH`, which should generally be wiped out by setuid programs. However, it is insanely paranoid to consider the possibility that they are not nul-terminated. This would require such a fundamental violation of the kernel's interface that you might as well give up. If environment variables aren't nul-terminated, programs would crash left and right, the kernel is broken. The SFV thing is not really related, that's a rootkit installed by a third-party. – Dietrich Epp Sep 26 '16 at 17:58