I am working on a fairly simple program that should print a file tree for a given path. The program worked with plain recursion, but when I modified it to fork a new child process for each directory it encounters, I started getting some weird output. It appears that processes are being allowed to move back up into their parent directory.

I have been testing it on a small sample folder called test. It contains two files (testFile1 and testFile2) and a folder (testSub), which in turn contains only two files (testSubFile1 and testSubFile2). As far as I can tell, the child process that is supposed to iterate through testSub does that, but then moves up a directory to test and iterates through that as well.

#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <dirent.h>
#include <string.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <unistd.h>

int recFork(char *path){
    /*int key = rand() % 113;
    printf("--------\n--------\n");
    printf("New function\n");
    printf("Key: %d\n", key);
    printf("path: %s\npid: %d\n", path, getpid());
    printf("--------\n--------\n");*/
    int status;
    pid_t pID = 1;
    char name[1024];
    struct stat statbuf;
    if(stat(path, &statbuf) == -1)
        return -1;

    /* if the item is a file */
    if(S_ISREG(statbuf.st_mode)){
        printf("pID: %d   ", getpid());
        printf("%s\t%8ld\n", path, statbuf.st_size);
    }

    /* if the item is a directory */
    else if((statbuf.st_mode & S_IFMT) == S_IFDIR){
        pID = fork();

       if(pID > 0){    //parent
           //printf("Forked child with pID: %d\n", pID);
           waitpid(pID, &status, WUNTRACED);
           //printf("Killed: %d\n", pID);
       }
       else if(pID == 0){   //child
           //printf("Child: %d\n", getpid());
           DIR *dir;
           struct dirent *dp = NULL;
           if ((dir = opendir(path)) == NULL){
               printf("Cannot open %s\n", path);
               exit(EXIT_FAILURE);
           }
           else{
               printf("DIR: %s/\n", path);
               while((dp = readdir(dir)) != NULL){
                   //printf("pID: %d key: %d dp = %s\n", getpid(), key, dp->d_name);
                   if(strcmp(dp->d_name, ".") == 0 || strcmp(dp->d_name, "..") == 0)
                       continue;
                   sprintf(name, "%s/%s", path, dp->d_name);
                   //printf("Process: %d  Key: %d\n", getpid(), key);
                   //printf("I was passed: %s\n", path);
                   //printf("Calling recFork(%s)\n\n", name);
                   recFork(name);
               }
               closedir(dir);
           }
       }
       else{   //failed to fork
           printf("Failed to fork\n");
           exit(EXIT_FAILURE);
       }
    }

    //printf("Returning from : %d with key: %d\n", getpid(), key);

    return 0;
}

The code includes quite a few commented-out printf statements that I used while trying to debug this. Honestly, I'm at a loss and I don't know who else to ask. I would really like to learn what I am doing wrong, so if someone can point me in the right direction, that would be much appreciated.

dragosht
bvallerand
  • Why do you need to `fork`? On Linux, consider [nftw(3)](http://man7.org/linux/man-pages/man3/nftw.3.html) – Basile Starynkevitch Oct 22 '14 at 14:33
  • I know forking is not necessary and I have it working without forking but I wanted to know how it would work if I used forking and it started producing weird results. I added in the printf statements to show the pID at different points as well as the semi random "key" generated each time the function is called to try and differentiate between the different processes and the recursive calls within the process. – bvallerand Oct 22 '14 at 16:30

2 Answers

What's probably happening is that readdir() returns entries in the order they come out of the filesystem, which has no relation to how ls displays them. readdir() output is raw and unsorted, while ls sorts everything by name by default.

For example, the on-disk order of your directory entries may be something like this:

./test/.
./test/..
./test/testFile1
./test/testSub/.
./test/testSub/..
./test/testFile2

Since readdir() returns everything in the order it's stored in the directory listing, you'll get ., then .., work on testFile1, recurse into testSub, then go back UP to the main test dir and work on testFile2.

And note that you're not checking for type-of-file. You'll be attempting opendir() calls on the actual files, which will fail, and cause that particular process to exit(). You should be checking filetypes as returned by stat and only doing the recursive calls when it's actually a directory.

Marc B
  • I thought I was ignoring . and .. with this statement `if(strcmp(dp->d_name, ".") == 0 || strcmp(dp->d_name, "..") == 0) continue;` and checking for file vs directory with `if(S_ISREG(statbuf.st_mode))...` and `else if((statbuf.st_mode & S_IFMT) == S_IFDIR){...` – bvallerand Oct 22 '14 at 14:53
  • Whoops, right... It's been a long while since I dealt with C, and I'd never used the POSIX macros at all. But the point about the random-ish ordering holds: if you're going "back up", it's because that's the order in which things are being presented to your code by the underlying OS code. – Marc B Oct 22 '14 at 16:59

I figured it out!

The issue was that each child was forked x levels deep in the recursion. Although I knew each child is a copy of its parent, I hadn't realized that the child would also be x levels deep in its own copy of the recursive call stack.

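Before a child could finish, it had to return back out through all of those recursive calls, and at each level it carried on with that level's readdir() loop, which is why it looked like it was moving back up into the parent directory. I solved the issue by adding the statement exit(EXIT_SUCCESS); right after the closedir(dir); statement, so instead of returning through all of the levels of recursion the child just exits once it has finished its own directory.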
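For anyone who finds this later, here is a minimal sketch of the function with that change applied. The name recForkFixed is only for illustration. The fflush(NULL) before fork(), the use of snprintf(), and passing 0 instead of WUNTRACED to waitpid() are extra tweaks I am assuming are sensible; they were not part of the fix described above.

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Print the tree under path, forking one child per directory.
 * Returns 0 on success, -1 if path cannot be stat'ed or the child failed. */
int recForkFixed(const char *path)
{
    struct stat statbuf;
    if (stat(path, &statbuf) == -1)
        return -1;

    if (S_ISREG(statbuf.st_mode)) {             /* regular file */
        printf("pID: %d   %s\t%8ld\n", (int)getpid(), path,
               (long)statbuf.st_size);
        return 0;
    }
    if (!S_ISDIR(statbuf.st_mode))              /* neither file nor directory */
        return 0;

    /* Assumption: flush first so the child does not inherit, and later
     * re-print, output still sitting in the parent's stdio buffers. */
    fflush(NULL);

    pid_t pID = fork();
    if (pID < 0) {                              /* failed to fork */
        perror("fork");
        return -1;
    }
    if (pID > 0) {                              /* parent: wait, then carry on */
        int status;
        if (waitpid(pID, &status, 0) == -1)
            return -1;
        return (WIFEXITED(status) && WEXITSTATUS(status) == EXIT_SUCCESS)
                   ? 0 : -1;
    }

    /* child: walk this one directory, then exit */
    DIR *dir = opendir(path);
    if (dir == NULL) {
        fprintf(stderr, "Cannot open %s\n", path);
        exit(EXIT_FAILURE);
    }
    printf("DIR: %s/\n", path);

    struct dirent *dp;
    char name[1024];
    while ((dp = readdir(dir)) != NULL) {
        if (strcmp(dp->d_name, ".") == 0 || strcmp(dp->d_name, "..") == 0)
            continue;
        snprintf(name, sizeof name, "%s/%s", path, dp->d_name);
        recForkFixed(name);
    }
    closedir(dir);

    /* The fix: exit here instead of returning into the copied frames
     * of the parent's recursion. */
    exit(EXIT_SUCCESS);
}
```

With the exit() in place, only the original process ever returns to the caller, so each directory is walked by exactly one child.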

Thanks everyone for your help along the way!

bvallerand