using stat to detect whether a file exists (slow?)

Question

I'm using code like the following to check whether a file has been created before continuing. The problem is that the file shows up in the file browser well before stat detects it. Is there a problem with doing this?

//... do something

struct stat buf;

while(stat("myfile.txt", &buf))
  sleep(1);

//... do something else

Alternatively, is there a better way to check whether a file exists?

What file browser? What is writing the file? Are you sure the file isn't being written under a slightly different name and then being renamed at the last moment? — Greg Hewgill, Jul 29 '10 at 13:28
I'm using Konqueror, but Dolphin also notifies me earlier than stat does. The file is being written by an app I wrote, so I know what is written and where. Also, the file is an empty file I'm writing just to signal that a process has completed. — james edge, Jul 29 '10 at 13:50
How long is this lag time you're referring to? Is it on the order of microseconds, or minutes? There should be no reason why `stat()` should fail to indicate the file exists when it really does. I suspect there is something else going on here that you haven't yet recognised. — Greg Hewgill, Jul 29 '10 at 13:53
This is what I'm thinking. It's in the region of 10 seconds to a minute or so. It's certainly noticeable. — james edge, Jul 29 '10 at 13:59
Are both your programs running on the same machine, or are you using NFS or other file server protocol? Are you certain that your `while` loop is actually running? (Have it print `"not yet"` or something every time it sleeps.) — Greg Hewgill, Jul 29 '10 at 14:05
Your problem is not `stat`, it's `sleep`. Basically any use of `sleep` is a bug. — R.. GitHub STOP HELPING ICE, Aug 03 '13 at 00:51
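
The diagnostic suggested in the comments above (print something on every iteration) can be sketched as follows. This is only an illustration: `wait_for_file` and its `max_tries` parameter are hypothetical names, not part of the original code. It also shows one way the original loop can mislead: `while(stat(...))` keeps spinning on any failure, not just on a missing file, so the sketch reports errors other than `ENOENT` instead of silently retrying.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>

#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: polls for `path` up to `max_tries` times, sleeping one
// second between attempts. Returns true as soon as stat() succeeds, false if
// it gives up or if stat() fails for a reason other than "no such file".
bool
wait_for_file (const char *path, int max_tries)
{
    struct stat buf;

    for (int attempt = 1; attempt <= max_tries; ++attempt)
    {
        if (stat (path, &buf) == 0)
            return true;

        if (errno != ENOENT)
        {
            // For example EACCES: the file may exist, but stat() cannot see it.
            std::fprintf (stderr, "stat(%s) failed: %s\n",
                          path, std::strerror (errno));
            return false;
        }

        std::fprintf (stderr, "not yet (attempt %d)\n", attempt);
        if (attempt < max_tries)
            sleep (1);
    }
    return false;
}
```

Calling `wait_for_file ("myfile.txt", 60)` would then show whether the loop is running at all and whether `stat` is failing for an unexpected reason.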

score 4 · Answer 1 · answered Jul 29 '10 at 13:37

Using inotify, you can arrange for the kernel to notify you when a change to the file system (such as a file creation) takes place. This may well be what your file browser is using to know about the file so quickly.

Jerry Coffin


score 3 · Accepted Answer · 2010-07-29T19:14:53.443

The "stat" system call collects a range of information about the file, such as the number of hard links pointing to it or its "inode" number. If all you need is an existence check, look at the "access" system call, which performs only that check when you pass the "F_OK" flag as "mode".
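
As a minimal sketch of such an existence check (the helper name `file_exists` is made up for illustration and is not part of any library):

```cpp
#include <string>

#include <unistd.h>

// Hypothetical helper: reports whether `path` exists at the moment of the call,
// as seen by the calling process.
bool
file_exists (const std::string &path)
{
    // F_OK tests for existence only; read, write, and execute permissions on
    // the file itself are not checked. access() returns 0 on success and -1
    // (with errno set) otherwise.
    return access (path.c_str (), F_OK) == 0;
}
```

Note that a `false` result here covers every failure, including a directory in the path that the process is not allowed to search, so it means "could not confirm the file exists" rather than strictly "the file is absent".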

There is, however, a small problem with your code: it puts the process to sleep for a second every time it checks for a file that doesn't exist yet. To avoid that, you can use the inotify API, as suggested by Jerry Coffin, to have the kernel notify you when the file you are waiting for appears. Keep in mind that inotify does not notify you if the file is already there, so you need both "access" and "inotify". Set up the watch first and call "access" afterwards; otherwise a file created between the check and the start of the watch would be missed.

There is no better or faster way to check whether a file exists. If your file browser still shows the file slightly sooner than this program detects it, then the renaming that Greg Hewgill suggested is probably what is happening.

Here is a C++ code example that sets up an inotify watch, checks whether the file already exists, and waits for it otherwise:

#include <cstdio>
#include <cstring>
#include <string>

#include <unistd.h>
#include <sys/inotify.h>

int
main ()
{
    const std::string directory = "/tmp";
    const std::string filename = "test.txt";
    const std::string fullpath = directory + "/" + filename;

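    int fd = inotify_init ();
    if (fd < 0)
    {
        perror ("inotify_init");
        return 1;
    }

    // Watch the directory rather than the file, since the file may not exist yet.
    int watch = inotify_add_watch (fd, directory.c_str (),
                                   IN_MODIFY | IN_CREATE | IN_MOVED_TO);
    if (watch < 0)
    {
        perror ("inotify_add_watch");
        close (fd);
        return 1;
    }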

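    // Check for the file only after the watch is in place, so that a file
    // created in between is not missed.
    if (access (fullpath.c_str (), F_OK) == 0)
    {
        printf ("File %s exists.\n", fullpath.c_str ());
        close (fd);
        return 0;
    }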

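    // Align the buffer so that it can safely be read as inotify_event records.
    alignas (inotify_event) char buf [1024 * (sizeof (inotify_event) + 16)];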
    ssize_t length;

    bool isCreated = false;

    while (!isCreated)
    {
        length = read (fd, buf, sizeof (buf));
        if (length < 0)
        {
            // Report the failure instead of exiting silently.
            perror ("read");
            break;
        }
        inotify_event *event;
        for (size_t i = 0; i < static_cast<size_t> (length);
             i += sizeof (inotify_event) + event->len)
        {
            event = reinterpret_cast<inotify_event *> (&buf[i]);
            if (event->len > 0 && filename == event->name)
            {
                printf ("The file %s was created.\n", event->name);
                isCreated = true;
                break;
            }
        }
    }

    inotify_rm_watch (fd, watch);
    close (fd);
}

One question: is sleep(1) such a big problem? A couple of seconds of lag is not an issue, and that's the most I'd expect from adding the sleep call. — james edge, Jul 29 '10 at 14:57
No, eliminating sleep is more a pursuit of excellence than a fix for the problem if we are talking about a 10 to 60 second delay here. Greg Hewgill asks the right questions about NFS-like replication and different machines. Those are the most probable causes. Also, there is a "waitpid" system call that you can use to wait for the process to finish instead of creating and polling a file. — , Jul 29 '10 at 15:13
This also has a race condition in the fact that there are no guarantees that the file exists after the notify OR access function calls. Only at the time of the calls. And in fact, testing to see if a file exists should probably be avoided. You should assume the file exists and handle the case if you cannot access it or lose it while working with it. Example 1: File exists but you don't have access. Example 2: File "goes away" because the mount disappears (NFS). — Rahly, Sep 15 '16 at 17:13
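
The approach described in the comment above (skip the existence test, attempt the operation, and handle the failure) might look roughly like this. It is only a sketch; `open_or_report` is a hypothetical name introduced for illustration.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>

#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: tries to open `path` read-only without testing for its
// existence first. Returns the file descriptor on success; on failure it
// prints the reason, returns -1, and leaves errno set to the cause.
int
open_or_report (const char *path)
{
    int fd = open (path, O_RDONLY);
    if (fd < 0)
    {
        int saved = errno; // fprintf may overwrite errno, so keep a copy
        std::fprintf (stderr, "cannot open %s: %s\n",
                      path, std::strerror (saved));
        errno = saved;
    }
    return fd;
}
```

The caller can then tell `ENOENT` (file missing) apart from `EACCES` (file present but not accessible), which covers the first example in the comment. A file that disappears later still has to be handled as errors from subsequent reads.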

score 1 · Answer 3 · answered Jul 29 '10 at 13:38

Your code checks whether the file is there once every second. You can use inotify to get an event instead.

Omry Yadan

