`stat` is a POSIX system call (available on Linux, Unix and even Windows) that returns a bundle of information about a path: size, type, protection bits and so on. Python has to call it at some point to get the size (and it does), because there is no system call that returns only the size. So the two are the same performance-wise. `os.stat` may be marginally faster, but only because `os.path.getsize` adds one more Python function call, which has nothing to do with I/O. `os.path.getsize` is just simpler to write.
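
To make the equivalence concrete, here is a minimal sketch comparing the two calls on a throwaway temporary file. The helper names `size_via_getsize` and `size_via_stat` are made up for illustration.

```python
import os
import tempfile


def size_via_getsize(path):
    # Convenience wrapper: internally this ends up calling os.stat.
    return os.path.getsize(path)


def size_via_stat(path):
    # Direct route: one stat call, then read the size field.
    return os.stat(path).st_size


# Create a small temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world")
    demo_path = tmp.name

try:
    sizes = (size_via_getsize(demo_path), size_via_stat(demo_path))
finally:
    os.remove(demo_path)

print(sizes)
```

Both values should be identical, since they come from the same underlying `stat` result.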

That said, before calling `os.path.getsize` you have to make sure that the path is actually a file. When called on a directory, `getsize` still returns some value (tested on Windows), probably related to the size of the directory node, so you have to call `os.path.isfile` first, which is another call to `os.stat`.

In the end, if you want to maximize performance, call `os.stat` yourself, check the result to see whether the path is a file, then read the `st_size` field. That way you're calling `stat` only once.
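
The single-`stat` approach could look roughly like this. It is a sketch, not the only way to do it: `file_size_or_none` is a hypothetical helper name, and returning `None` for directories and missing paths is an assumption about what the caller wants.

```python
import os
import stat


def file_size_or_none(path):
    """Return the size in bytes if path is a regular file, else None.

    Hypothetical helper: performs exactly one os.stat call.
    """
    try:
        info = os.stat(path)
    except OSError:
        # Path does not exist or cannot be accessed.
        return None
    # S_ISREG checks the file type bits in st_mode (regular file or not).
    if stat.S_ISREG(info.st_mode):
        return info.st_size
    return None
```

Compared with `os.path.isfile(path)` followed by `os.path.getsize(path)`, this reads the type and the size from the same `stat` result instead of asking the operating system twice.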

If you're using `os.walk` to scan the directory, you're exposed to more hidden `stat` calls, so look into `os.scandir` (available since Python 3.5).
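
As a rough sketch of the `os.scandir` route, the function below collects the sizes of the regular files directly inside one directory. The name `file_sizes` is invented for this example, and the `with` form assumes Python 3.6 or later (on 3.5, iterate over `os.scandir(directory)` directly).

```python
import os


def file_sizes(directory):
    """Map file name -> size in bytes for regular files in directory.

    Hypothetical helper; does not recurse into subdirectories.
    """
    sizes = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            # DirEntry caches type and stat information, so repeated
            # checks on the same entry do not hit the OS again.
            if entry.is_file():
                sizes[entry.name] = entry.stat().st_size
    return sizes
```

The `DirEntry` objects can often answer `is_file()` from the directory listing itself, which is where the saving over a names-only scan comes from.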