
I'm creating a program which recursively finds all #include dependencies between C files in a specified directory and its child directories. Dependency paths should be absolute, so I use realpath to resolve relative paths and symbolic links. Since there can be many files, I have decided to make the program multithreaded with OpenMP or pthreads.

The problem is that realpath resolves relative paths against the current working directory. All threads share the same working directory, so I would need to guard every chdir/realpath pair with a mutex.

Is there a standard alternative to realpath that also takes, as an argument, the directory to resolve the path from?

Klas. S
  • Why not just have each thread maintain a string of their current path relative to your starting directory? You could append to it every time you descend into a subdirectory, and remove a level when you come out. You usually only see relative paths anyway, like `#include "common/util.h"`. They have to be resolved by the include paths `-I` that you give to the compiler. Without that information, I don't see how you're correctly resolving full paths anyway. – e0k Dec 18 '16 at 00:13
  • You are right, that is probably the solution I will go with. It doesn't solve symlinks, but really who symlinks include files..? Thank you for bringing -I to mind, I'll fix that as well. – Klas. S Dec 18 '16 at 00:30
  • Lots of people symlink headers. Or, at least, some of the bigger projects I've worked on have headers symlinked for various different reasons (many of them not very good, but that's what happens over 30 years). – Jonathan Leffler Dec 18 '16 at 03:05

2 Answers


There are a number of POSIX functions with the at suffix (such as openat()) which work with a specified directory. There isn't, however, a realpathat() function in POSIX. There also isn't an opendirat(), but there is fdopendir() which creates a DIR stream for an open directory file descriptor.

In a multithreaded program, any use of chdir() is fraught.

You should rethink your algorithm to use the various *at() functions to avoid needing to change directory at all. You'd open the directories for reading (open() or openat() with O_DIRECTORY, perhaps — though O_DIRECTORY isn't 100% necessary, nor is it supported on macOS) so that you can then access the files appropriately using the directory file descriptor in the *at() calls.
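As a rough illustration of that approach, here is a minimal sketch of a directory walk that never calls chdir(). The helper name count_files_at is hypothetical, and the O_DIRECTORY fallback is an assumption for platforms that lack the flag. Each level is opened with openat() relative to its parent's descriptor, turned into a stream with fdopendir(), and its entries are examined with fstatat(), so concurrent threads would not depend on the shared working directory.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef O_DIRECTORY
#define O_DIRECTORY 0   /* assumption: plain open() where the flag is missing */
#endif

/* Hypothetical helper: count regular files below the directory `name`,
 * which is resolved relative to the open directory `parentfd`.
 * Returns the count, or -1 if `name` cannot be opened as a directory. */
long count_files_at(int parentfd, const char *name) {
    assert(name != NULL);
    int fd = openat(parentfd, name, O_RDONLY | O_DIRECTORY);
    if (fd == -1)
        return -1;
    DIR *dir = fdopendir(fd);          /* the stream now owns fd */
    if (dir == NULL) {
        close(fd);
        return -1;
    }
    long total = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;
        struct stat st;
        /* Look at the entry itself; symlinks are not followed here. */
        if (fstatat(dirfd(dir), ent->d_name, &st, AT_SYMLINK_NOFOLLOW) == -1)
            continue;
        if (S_ISREG(st.st_mode)) {
            total++;
        } else if (S_ISDIR(st.st_mode)) {
            /* Recurse relative to this directory's descriptor. */
            long sub = count_files_at(dirfd(dir), ent->d_name);
            if (sub > 0)
                total += sub;
        }
    }
    closedir(dir);                     /* also closes fd */
    return total;
}
```

A call such as count_files_at(AT_FDCWD, "src") would start from the working directory without changing it; a real dependency scanner would presumably open and parse each regular file with openat() where this sketch only counts it.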

Jonathan Leffler

I worked a bit on a solution. It is by no means optimal, but at least it seems to work. I created the function abspathat, which turns a relative path into an absolute path. Then I use the built-in readlinkat to resolve the result if its final component is a symlink. The solution turns paths like "../code.c", "./code.c" and "code.c" into "/dir/code.c". However, it does not currently fix paths such as ../dir/../code.c, though why would anyone create such a path? Nor does it check whether the file actually exists. Feel free to improve or do whatever you like with this code.

#include <string.h>
#include <unistd.h>
#include <stdlib.h>
#include <dirent.h>
#include <stdio.h>
/*****************************************************************************/
char *abspathat(char *dirpath, int dirlen, char *path);
/*****************************************************************************/
static const int MAX_FILEPATH = 4096;
/*****************************************************************************/
char *realpathat(int dirfd, char *dirpath, int dirlen, char *path) {
    char *abs = abspathat(dirpath, dirlen, path);
    char *buf = malloc(sizeof(char)*MAX_FILEPATH);
    ssize_t size = readlinkat(dirfd, abs, buf, MAX_FILEPATH);
    char *realpath;
    if(size != -1) {
        /* Allocate size+1 bytes; sizeof(size+1) would only be sizeof(ssize_t). */
        realpath = malloc(size + 1);
        memcpy(realpath, buf, size);
        realpath[size] = '\0';
        free(abs);
    } else {
        realpath = abs;
    }
    free(buf);
    return realpath;
}
/*---------------------------------------------------------------------------*/
char *abspathat(char *dirpath, int dirlen, char *path) {
    /* If absolute */
    if(path[0] == '/') {
        /* Return a copy so the caller can always free() the result. */
        return strdup(path);
    }
    int i;
    /* Default to the whole path so names like ".hidden" are not left unset. */
    char *right = path;
    int d = 0;
    int rlen = strlen(path);
    int llen = 0;
    if(path[0] == '.') {
        if(path[1] == '.' && path[2] == '/') {
            for(i = 3, d = 1; path[i] == '.'
                    && path[i+1] == '.'
                    && path[i+2] == '/'
                    && i < rlen; i+=3) {
                d++;
            }
            right = &path[i];
            rlen -= i;
        } else if(path[1] == '/') {
            right = &path[2];
            rlen -= 2;
        }
    } else {
        right = &path[0];
    }
    for(i = dirlen - 1 - (dirpath[dirlen-1] == '/'); d && i; i--) {
        if(dirpath[i] == '/') {
            d--;
        }
    }
    llen = i+1;
    char *cpy = malloc(sizeof(char)*(llen + rlen + 2));
    memcpy(cpy, dirpath, llen);
    cpy[llen] = '/';
    memcpy(cpy+llen+1, right, rlen);
    cpy[llen+rlen+1] = '\0';
    return cpy;
}
/*---------------------------------------------------------------------------*/
int main(int argc, char *argv[]) {
    if(argc == 3) {
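        char *dirpath = argv[1];
        DIR *d = opendir(dirpath);
        if(d == NULL) {
            /* dirfd(NULL) would be invalid, so bail out early. */
            perror("opendir");
            return 1;
        }
        char *path = argv[2];
        char *resolved = realpathat(dirfd(d), dirpath, strlen(dirpath), path);
        printf("%s\n", resolved);
        closedir(d);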
    } else {
        printf("realpathat [directory] [filepath]\n");
    }
    return 0;
}
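
For the case the code above leaves open (paths such as ../dir/../code.c), one possible approach is to join the two paths first and then collapse "." and ".." components wherever they appear. The following is only a sketch: join_normalized is a hypothetical name, it assumes the base directory is absolute, and it works purely lexically, so it neither resolves symlinks nor checks that the file exists.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: join `path` onto the absolute directory `base` and
 * collapse empty, "." and ".." components lexically. Returns a malloc()ed
 * string the caller must free(), or NULL if allocation fails. */
char *join_normalized(const char *base, const char *path) {
    assert(base != NULL && path != NULL && base[0] == '/');
    size_t blen = strlen(base), plen = strlen(path);
    char *in = malloc(blen + plen + 2);   /* "base/path", or just "path" */
    char *out = malloc(blen + plen + 2);  /* result is never longer than in */
    if (in == NULL || out == NULL) {
        free(in);
        free(out);
        return NULL;
    }
    if (path[0] == '/') {
        memcpy(in, path, plen + 1);       /* absolute path: ignore base */
    } else {
        memcpy(in, base, blen);
        in[blen] = '/';
        memcpy(in + blen + 1, path, plen + 1);
    }
    size_t olen = 0;
    const char *p = in;
    while (*p != '\0') {
        while (*p == '/')                 /* skip runs of separators */
            p++;
        const char *end = p;
        while (*end != '\0' && *end != '/')
            end++;
        size_t clen = (size_t)(end - p);
        if (clen == 0)
            break;                        /* only trailing slashes were left */
        if (clen == 1 && p[0] == '.') {
            /* "." refers to the current directory: nothing to do */
        } else if (clen == 2 && p[0] == '.' && p[1] == '.') {
            /* "..": drop the last component, staying at the root if empty */
            while (olen > 0 && out[--olen] != '/') {
            }
        } else {
            out[olen++] = '/';
            memcpy(out + olen, p, clen);
            olen += clen;
        }
        p = end;
    }
    if (olen == 0)
        out[olen++] = '/';                /* everything collapsed to the root */
    out[olen] = '\0';
    free(in);
    return out;
}
```

Because the collapsing is lexical, a ".." that follows a symlinked directory may give a different result than the kernel would, so this is best treated as a cleanup step before any symlink resolution rather than a replacement for it.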
Klas. S