
In this example, I'm using Python 3.6.5 installed using pyenv in an OSX shell.

I've been toying around with some proof of concept file watching code and I figured using a delta of a file's current and last measured st_mtime would be enough to "detect" that a file has changed.

The code:

import os


def main():
    file_path = 'myfile.txt'
    last_modified = os.stat(file_path).st_mtime
    while True:
        check_last_modified = os.stat(file_path).st_mtime
        delta = check_last_modified - last_modified

        if delta != 0.0:
            print("File was modified.")

        last_modified = check_last_modified



if __name__ == '__main__':
    main()

The weird thing is that different types of basic file modification operations cause "File was modified." to print a different number of times, sometimes more than once.

Assuming myfile.txt exists, I get a different number of prints based on the operation:

It prints 1 time with: $ touch myfile.txt

It prints 2 times with: $ echo "" > myfile.txt

It prints 1 time with:

$ cat <<EOF > myfile.txt
> EOF

It prints 2 times with (empty line):

$ cat <<EOF > myfile.txt
>
> EOF

It prints 1 time using python to write an empty string:

def main():
    with open('myfile.txt', 'w') as _file:
        _file.write('')

if __name__ == '__main__':
    main()

It prints 2 times using python to write a non-empty string:

def main():
    with open('myfile.txt', 'w') as _file:
        _file.write('a')

if __name__ == '__main__':
    main()

The biggest difference seems to be the presence of a string other than a newline, but seeing as how the echo command results in two prints I'm not inclined to believe it's bound to that in any way.

Any ideas?

Alex Milstead

1 Answer

Your loop is a busy-waiting loop, so it can catch several timestamp changes in quick succession.

When Python opens the file for writing (open), the existing file is truncated, which updates the modification time.

The modification time is then updated once more when the file is closed and the buffered data is written out, which explains why you catch 2 time updates.

touch just sets the modification time once, but echo acts the same as your Python script: the modification time is set when the existing file is opened, and set again when it is closed.
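To make the two updates visible without a busy loop, here is a minimal sketch that records the modification time before the rewrite, right after the file is opened, and right after the data is flushed. The helper names mtimes_during_write and demo are hypothetical, and the file is backdated with os.utime purely so the first update shows up regardless of the filesystem's timestamp resolution. Whether the last two values differ depends on that resolution, so treat this as an illustration rather than a guarantee.

```python
import os
import tempfile


def mtimes_during_write(path, data):
    """Return (before, after_open, after_write) st_mtime_ns values for one rewrite of path."""
    before = os.stat(path).st_mtime_ns
    with open(path, 'w') as f:
        # Opening with 'w' truncates the existing file, which already updates st_mtime.
        after_open = os.stat(path).st_mtime_ns
        f.write(data)
        f.flush()  # hand the buffered data to the OS now instead of waiting for close
        after_write = os.stat(path).st_mtime_ns
    return before, after_open, after_write


def demo():
    """Rewrite a backdated temporary file and return the three timestamps."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'myfile.txt')
        with open(path, 'w') as f:
            f.write('old contents')
        # Assumption for the demo: backdate the file to the epoch so any update is visible.
        os.utime(path, (0, 0))
        return mtimes_during_write(path, 'a')


if __name__ == '__main__':
    print(demo())
```

If after_open and after_write come out different, a fast enough polling loop can observe both changes, which would match the two prints in the question.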

The busy loop and the open/close operations create a race condition, so the number of time updates you see is undefined (which explains why your script misses one update in a cat command where the data is small).
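One way to make the count less sensitive to that race is to sleep between polls, so that updates landing within one interval are reported as a single change. The following is only a sketch under that assumption: check_modified and watch are hypothetical names, the 0.5 second interval is arbitrary, and st_mtime_ns is used in place of st_mtime to avoid comparing floats.

```python
import os
import time


def check_modified(path, last_mtime_ns):
    """Return (changed, current_mtime_ns) for path relative to the last seen value."""
    current = os.stat(path).st_mtime_ns
    return current != last_mtime_ns, current


def watch(path, interval=0.5):
    """Poll path forever, printing once per interval in which its mtime changed."""
    _, last = check_modified(path, None)  # take the initial reading
    while True:
        # Sleeping coalesces the open-time and write-time updates into one detection
        # as long as both happen within the same interval.
        time.sleep(interval)
        changed, last = check_modified(path, last)
        if changed:
            print("File was modified.")


if __name__ == '__main__':
    watch('myfile.txt')
```

Two updates that straddle a poll boundary would still be reported twice, so this narrows the race but does not remove it.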

Jean-François Fabre