I am trying to traverse a directory structure and open every file in that structure. To traverse, I am using opendir() and readdir(). Since I already have the directory entry, it seems stupid to build a path and open the file; that presumably forces Linux to look up the directory and file I just traversed.

Level 2 I/O (open, creat, read, write) requires a path to open a file. Is there any call that either opens a filename inside a directory, or opens a file given an inode?

Dov
  • "that presumably forces Linux to find the directory and file I just traversed." - I'd do some digging/measuring to see if this is worth trying to optimize. Your presumption here may not be correct. – Joe Jul 04 '14 at 13:26
  • If nothing else, it forces ME to build the path when I don't want to! – Dov Jul 04 '14 at 13:26
  • There is the (rather new) openat() system call. However, there's not much you get from opendir()/readdir() to pass into the openat() call. – nos Jul 04 '14 at 14:04

1 Answer

You probably should use nftw(3) to recursively traverse a file tree.
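As a rough sketch of what that could look like (the names `visit`, `open_all_files` and the `files_opened` counter are made up for illustration, and the descriptor limit of 16 is an arbitrary choice), nftw(3) hands each callback a ready-made path, so you never build one yourself:

```c
#define _XOPEN_SOURCE 700   /* exposes nftw(3) and its FTW_* flags */
#include <ftw.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

static long files_opened;   /* hypothetical counter, for illustration only */

/* Called by nftw() once per entry; fpath is the path nftw built for us. */
static int visit(const char *fpath, const struct stat *sb,
                 int typeflag, struct FTW *ftwbuf)
{
    (void)ftwbuf;
    /* FTW_F means "not a directory or symlink"; also skip FIFOs, devices... */
    if (typeflag != FTW_F || !S_ISREG(sb->st_mode))
        return 0;
    int fd = open(fpath, O_RDONLY);
    if (fd < 0) {
        perror(fpath);
        return 0;           /* keep walking despite the error */
    }
    /* ... read from fd here ... */
    files_opened++;
    close(fd);
    return 0;               /* a non-zero return would stop the walk */
}

/* Walk the tree rooted at root; returns the number of regular files
   opened, or -1 if the walk itself failed. */
long open_all_files(const char *root)
{
    files_opened = 0;
    /* 16 = max descriptors nftw may keep open; FTW_PHYS = don't follow symlinks */
    if (nftw(root, visit, 16, FTW_PHYS) != 0)
        return -1;
    return files_opened;
}
```

Calling `open_all_files(".")` would then visit everything below the current directory. The static counter makes this sketch non-reentrant, which is a limitation of nftw's callback interface (it has no user-data argument).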

Otherwise, in a portable way, construct your directory + filename path using e.g.

snprintf(pathbuf, sizeof(pathbuf), "%s/%s", dirname, filename);

(or perhaps using asprintf(3), which is a GNU/BSD extension rather than standard C, but don't forget to free the result later)
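If you go the snprintf route, it is worth checking for truncation, since snprintf returns the length it would have written. A minimal sketch (the helper name `join_path` is hypothetical):

```c
#include <stdio.h>

/* Join dirname and filename into buf as "dirname/filename".
   Returns 0 on success, -1 if the result would not fit in bufsize bytes. */
int join_path(char *buf, size_t bufsize,
              const char *dirname, const char *filename)
{
    int n = snprintf(buf, bufsize, "%s/%s", dirname, filename);
    if (n < 0 || (size_t)n >= bufsize)
        return -1;          /* encoding error, or path was truncated */
    return 0;
}
```

A truncated path would silently name a different (probably nonexistent) file, so treating it as an error is the safer choice.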

And to answer your question about opening a file inside a directory: you could use openat(2), which is specified by POSIX.1-2008 and available on Linux. But I believe that you should really use nftw or construct your path as suggested above. Read also about the Linux-specific O_PATH and O_TMPFILE flags in open(2).
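For completeness, here is a sketch of the openat(2) variant. It assumes dirfd(3) is available to get the descriptor behind the DIR stream; the function name `open_entries_at` is made up, and it handles a single directory level only:

```c
#define _POSIX_C_SOURCE 200809L   /* exposes openat(2) and dirfd(3) */
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Open every regular file directly inside dirpath, relative to the
   directory's own descriptor, without building any path strings.
   Returns the number of files opened, or -1 if dirpath can't be opened. */
long open_entries_at(const char *dirpath)
{
    DIR *dir = opendir(dirpath);
    if (dir == NULL)
        return -1;
    int dfd = dirfd(dir);   /* owned by dir: closedir() closes it */
    long count = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        /* O_NOFOLLOW refuses symlinks; O_NONBLOCK avoids hanging on FIFOs */
        int fd = openat(dfd, ent->d_name,
                        O_RDONLY | O_NOFOLLOW | O_NONBLOCK);
        if (fd < 0)
            continue;
        struct stat st;
        if (fstat(fd, &st) == 0 && S_ISREG(st.st_mode)) {
            /* ... read from fd here ... */
            count++;
        }
        close(fd);          /* "." and ".." land here too and are skipped */
    }
    closedir(dir);
    return count;
}
```

To recurse, you could openat() a subdirectory with O_DIRECTORY and pass the result to fdopendir(3), but at that point nftw is probably less work.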

BTW, the kernel has to access the directory several times anyway (in practice the metadata is cached by the kernel's file system code), because another process could have modified the directory while you are traversing it.

Don't even think of opening a file through its inode number: this would violate several file system abstractions! (It might just barely be possible with insane and disgusting tricks, e.g. debugfs, and this could seriously damage your filesystem!!)

Remember that files are generally inodes, and can have zero (a process opened a file and then unlink(2)-ed it while keeping the open file descriptor), one (this is the usual case), or several (e.g. /foo/bar1 and /gee/bar2 could be hard-linked using link(2) ....) file names.
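You can watch the number of names change through st_nlink. The sketch below (helper names `names_of` and `trace_link_counts` are hypothetical) creates a file, hard-links it, then removes both names while keeping the descriptor open. On typical local Linux file systems I would expect the recorded counts to go 1, 2, 1, 0, though some file systems (NFS, for instance) may behave differently:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Number of names (hard links) currently referring to the open file, or -1. */
static long names_of(int fd)
{
    struct stat st;
    return fstat(fd, &st) == 0 ? (long)st.st_nlink : -1;
}

/* Create path_a, hard-link it as path_b, then remove both names while the
   descriptor stays open. counts[] receives the link count after each step.
   Returns 0 on success, -1 on failure. */
int trace_link_counts(const char *path_a, const char *path_b, long counts[4])
{
    int fd = open(path_a, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0)
        return -1;
    counts[0] = names_of(fd);           /* one name: path_a */
    if (link(path_a, path_b) != 0) {
        unlink(path_a);
        close(fd);
        return -1;
    }
    counts[1] = names_of(fd);           /* two names */
    unlink(path_a);
    counts[2] = names_of(fd);           /* one name left: path_b */
    unlink(path_b);
    counts[3] = names_of(fd);           /* no names, inode still alive */
    int ok = write(fd, "x", 1) == 1;    /* the nameless file is still usable */
    close(fd);                          /* now the inode can be freed */
    return ok ? 0 : -1;
}
```

The last step is why "open by inode" has no obvious meaning: at that point the inode has no path at all, yet it is a perfectly valid open file.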

Some file systems (e.g. FAT ...) don't have real inodes. The kernel fakes something in that case.

Basile Starynkevitch
  • openat is exactly what I was looking for, but why is constantly reconstructing a string path better? – Dov Jul 07 '14 at 10:20
  • Because `openat` might not work or be available on every kernel or filesystem. And constructing the string is much faster than invoking filesystem syscalls (which at least go to the file system cache, and perhaps even to the disk), so its cost is negligible. – Basile Starynkevitch Jul 07 '14 at 10:44