I'm reading lines from a group of files (log files) following them as they are written using pyinotify.
I'm opening and reading the files with python native methods:
file = open(self.file_path, 'r')
# ... later
line = file.readline()
This is generally stable and can handle the file being deleted and re-created. pyinotify will notify the unlink and subsequent link.
However some log files are not being deleted. Instead they are being truncated and new content written to the beginning of the same file.
I'm having trouble reliably detecting when this has occurred since pyinotify will simply report only a write. The only evidence I currently get is that pyinotify reports a write and readline()
returns an empty string. BUT, it is possible that two subsiquent writes could trigger the same behavior.
I have thought of comparing a file's size to file.tell()
but according to the documentation tell
produces an opaque number and it appears this can't be trusted to be a number of bytes.
Is there a simple way to detect a file has been truncated while reading from it?
Edit:
Truncating a file can be simulated with simple shell commands:
echo hello > test.log
echo hello >> test.log
# Truncate test.log
echo goodbye > test.log
To compliment this, a simple python script can be used to confirm that file.tell()
does not reduce when the file is truncated:
foo = open('./test.log', 'r')
line = foo.readline()
while line != '':
print(foo.tell())
print(line)
line = foo.readline()
# Put a breakpoint on the following line and
# truncate the file before it executes
print(foo.tell())