
I'm trying to verify the time of most recent modification of a file and got to the following:

    import os
    import time

    file = "test.bin"  # placeholder: any writable path

    print("before", time.time())
    with open(file, "wb") as fh:
        fh.write(b"12345")
    print("after", time.time())
    print("modified", os.path.getmtime(file))

I expect before < modified < after. But the output reads:

    before   1693392599.8838775
    after    1693392599.8839073
    modified 1693392599.8792782

What am I missing? Thank you.
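
For scale, the gap in the output above can be worked out directly. This is a minimal sketch, not part of the original code: it uses `decimal.Decimal` on the printed values so that no float rounding enters the arithmetic, and `math.ulp` (assumes Python 3.9+) to show how far apart adjacent floats are at this magnitude. The variable names are illustrative.

```python
from decimal import Decimal
import math

# Values copied from the output above.
before = Decimal("1693392599.8838775")
after = Decimal("1693392599.8839073")
modified = Decimal("1693392599.8792782")

# How far the reported mtime lags behind the "before" reading, in milliseconds.
lag_ms = (before - modified) * 1000
# How long the open/write/close took, in microseconds.
write_us = (after - before) * 1_000_000

# Spacing between adjacent float values at this magnitude, in nanoseconds.
float_step_ns = math.ulp(float(before)) * 1e9

print(f"mtime lags 'before' by {lag_ms} ms")
print(f"write took {write_us} us")
print(f"float spacing here is about {float_step_ns:.0f} ns")
```

The lag comes out at roughly 4.6 ms, while the write itself took about 30 µs and adjacent floats at this magnitude are only about 238 ns apart, so float rounding alone is several orders of magnitude too small to account for the difference.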

deceze
Gerry
  • Maybe mtime has a different resolution? – Bharel Aug 30 '23 at 10:58
  • I'd guess the result would depend on the actual filesystem. Running the same code on a Mac, I get results similar to those shown here, but on Windows things are as you were expecting. – rasjani Aug 30 '23 at 11:19
  • Different file systems will have different time resolutions that don't match the OS's time resolution. In NTFS for example the resolution is 100ns, in APFS it's 1ns while in FAT32 (used in memory cards) it's 2s and in HFS+ it's 1s. At the 100ns resolution, delays caused by thread-switching alone may be enough to explain the difference. Never mind 1ns – Panagiotis Kanavos Aug 30 '23 at 11:19
  • If you want the modification time to detect changes, you should probably use the file system's journaling support. The modification time can be manipulated, may not have the resolution you want or it may reflect the time a file was *opened* for modification, not when the file stream was closed. If, for example you preallocate a 100MB file and then write 50MB to it, which time will be the modification time? The start of the write or the end? – Panagiotis Kanavos Aug 30 '23 at 11:26
  • @PanagiotisKanavos If this is purely a matter of different resolutions, why is the mtime consistently *earlier* than the "before" time? I tried this with bash on Linux using `date` and `stat` and got similar results. – ekhumoro Aug 30 '23 at 11:59
  • The shell doesn't matter, the file system does. `time.time()` isn't guaranteed to be more accurate than 1s and doesn't get its value from a core's high-precision timer. You need `time.monotonic_ns()` or `perf_counter_ns()` for that. That's at least one reason `time()` isn't suitable for benchmarking. The returned value may be inside the file system's resolution window too. In a memory card, the modification time will always be the previous second. – Panagiotis Kanavos Aug 30 '23 at 12:27
  • Another reason is that `float` itself is susceptible to rounding errors. The `_ns` functions exist to avoid this. As the docs say though, neither `monotonic_ns()` nor `perf_counter_ns()` represent an actual datetime and are only suitable for calculating differences – Panagiotis Kanavos Aug 30 '23 at 12:32
  • @PanagiotisKanavos That's exactly why I tested with the shell - so as to rule out any python-specific differences. I already tried eliminating FP issues by using `time_ns` and `st_mtime_ns`, but it doesn't make enough difference. The offset is always much less than a second, so I can't see how anything you've suggested so far accounts *fully* for the consistently earlier mtime. I suspect the *reported* precision of the filesystem mtime just doesn't reliably match the precision of the time as determined by the kernel. – ekhumoro Aug 30 '23 at 13:15
  • Isn't that what I and others have been saying all along? Reading it as a 64-bit integer won't make that inaccurate value more accurate. The clock you compare it to isn't accurate either: it depends on a real-time clock powered by a battery, modified by a time server and updated infrequently. The *only* accurate values are provided by `perf_counter_ns`. – Panagiotis Kanavos Aug 30 '23 at 13:17
  • @PanagiotisKanavos Not exactly, no. You suggested various ways to account for ***a*** difference, but not a specifically earlier difference. What I meant by my previous comment is that I suspect the system just may not offer any other way to get at this specific information. Perhaps it *could*, but it just doesn't. – ekhumoro Aug 30 '23 at 13:44
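
The integer-nanosecond comparison described in the comments above can be reproduced with something like the following. This is a sketch only: `measure_mtime_offset` and the file name `probe.bin` are hypothetical names introduced here, and the numbers it prints depend on the operating system and file system in use.

```python
import os
import tempfile
import time


def measure_mtime_offset(path):
    """Write to path and return (before, mtime, after) as integer nanoseconds."""
    before = time.time_ns()
    with open(path, "wb") as fh:
        fh.write(b"12345")
    after = time.time_ns()
    # st_mtime_ns is an integer, so no float rounding is involved.
    mtime = os.stat(path).st_mtime_ns
    return before, mtime, after


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "probe.bin")
    for _ in range(5):
        before, mtime, after = measure_mtime_offset(path)
        # A negative value means the reported mtime is earlier than "before".
        print(f"mtime - before = {(mtime - before) / 1e6:+.3f} ms")
        time.sleep(0.05)
```

Repeating the measurement a few times shows whether the offset has a consistent sign on a given system, which is the observation under discussion, without any floats in the comparison.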

1 Answer


The Python documentation says the following about the precision of timestamps returned by `time.time()`:

The precision of the various real-time functions may be less than suggested by the units in which their value or argument is expressed. E.g. on most Unix systems, the clock “ticks” only 50 or 100 times a second.

I presume `getmtime()` just reads the timestamp stored in the file's metadata, so if `time.time()` is not precise enough, that could explain the slight discrepancy between the timestamps. What an interesting find.
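
One way to check what precision `time.time()` offers on a particular machine is to ask Python directly with `time.get_clock_info("time")` and compare that with the smallest step actually observed. The sketch below is illustrative: `smallest_observed_step` is a helper invented here, and the reported resolution is only what the platform advertises, so treat both numbers as rough indications.

```python
import time

# Ask Python what it knows about the clock behind time.time().
info = time.get_clock_info("time")
print("implementation:", info.implementation)
print("resolution (s):", info.resolution)
print("adjustable:", info.adjustable)


def smallest_observed_step(samples=100_000):
    """Return the smallest positive change seen between successive readings (ns)."""
    smallest = None
    prev = time.time_ns()
    for _ in range(samples):
        now = time.time_ns()
        delta = now - prev
        if delta > 0 and (smallest is None or delta < smallest):
            smallest = delta
        prev = now
    return smallest  # None if the clock never advanced during sampling


print("smallest observed step (ns):", smallest_observed_step())
```

If the observed step is far smaller than the gap between the "before" reading and the reported modification time, the precision of `time.time()` on its own is unlikely to be the whole explanation.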

Jan Morawiec
  • Actually, in the next paragraph the documentation suggests that the `time()` function might be *more* precise than the file timestamp in some environments. – Jan Morawiec Aug 30 '23 at 11:23