6

I wrote a small utility that generates XML describing any folder structure and compares folders via the generated XML. It supports both Windows and Mac. However, on Mac, recursively summing file sizes does not add up to the folder's total size. On investigation, it turned out that this is due to extended attributes and resource forks present on certain files.

Does anybody know how I can determine these extended attributes and resource forks, and their sizes, preferably in Python? Currently, I use os.path.getsize to get the size of each file and add up the file sizes to get the folder size.

hippietrail
Gagandeep Singh
  • There is the [xattr](https://github.com/xattr/xattr) module you can use to get a list of them, but not sure it would help you determine the exact size they take up. – jterrace Sep 27 '11 at 18:02
  • @jterrace Thanks, but determining the attributes alone doesn't help me. I need to know their size as well. – Gagandeep Singh Sep 27 '11 at 18:08

3 Answers

3

Merely a partial answer, but to learn the size of a resource fork you can simply use the ..namedfork pseudo-directory:

os.path.getsize("<path to file of interest>/..namedfork/rsrc")

It's theoretically possible that other named forks exist, but you can't discover a list of available forks.
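As a minimal sketch, the lookup above can be wrapped in a helper that tolerates files and platforms without a resource fork. The function name `resource_fork_size` is made up for illustration, and treating any OSError as "no fork" is an assumption about what you want, not something the filesystem guarantees.

```python
import os


def resource_fork_size(path):
    """Return the resource fork size of `path` in bytes, or 0 if unavailable.

    Hypothetical helper: it relies on the "..namedfork/rsrc" pseudo-path,
    which only exists on macOS filesystems that support resource forks.
    """
    fork_path = os.path.join(path, "..namedfork", "rsrc")
    try:
        return os.path.getsize(fork_path)
    except OSError:
        # Assumption: a failed lookup (e.g. on non-Mac platforms) means
        # there is no resource fork to count.
        return 0
```

Adding this value to os.path.getsize(path) for each file should account for the resource fork, though it still ignores other extended attributes.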

As for the extended attributes: what "size" are you interested in? You can use the xattr module to read their content, and from that the length of each key/value pair.

But if you are more interested in their "on disk" size, then it's worth noting that extended attributes are not stored in some sort of file. They form part of the file's metadata (i.e. just like the name and modification time) and are stored directly within a B*-tree node rather than in a separate "file".

donkopotamus
  • FileManager has APIs to get a list of available forks, such as `FSIterateForks`. No idea if a Python interface to it has been made, or would be easy to add... – hippietrail Mar 14 '21 at 03:12
3

You want the hidden member of a stat result called st_blocks.

>>> s = os.stat('some_file')
>>> s
posix.stat_result(st_mode=33261, st_ino=12583347, st_dev=234881026,
                  st_nlink=1, st_uid=1000, st_gid=20, st_size=9889973,
                  st_atime=1301371810, st_mtime=847731600, st_ctime=1301371422)
>>> s.st_size / 1e6 # size of data fork only, in MB
9.889973
>>> s.st_blocks * 512e-6 # total size on disk, in MB
20.758528

The file in question has about 10 MB in the resource fork, which shows up in the result from stat but in a "hidden" attribute. (Bonus points for anyone who knows exactly which file this is.) Note that it is documented in man 2 stat that the st_blocks attribute always measures increments of 512 bytes.

Note: st_size measures the number of bytes of data, but st_blocks measures size on disk including the overhead from partially used blocks. So,

>>> open('file.txt', 'w').write('Hello, world!')
13
>>> s = os.stat('file.txt')
>>> s.st_size
13
>>> s.st_blocks * 512
4096

Now if you do a "Get Info" in the Finder, you'll see that the file has:

Size: 4 KB on disk (13 bytes)
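Applied to the original question, the same idea could be used to total a whole folder. The following is only a sketch: the name `folder_sizes` is invented here, it counts regular directory entries once each (hard links would be counted twice), and it assumes a platform where `st_blocks` exists, which is not the case on Windows.

```python
import os


def folder_sizes(root):
    """Return (logical_bytes, on_disk_bytes) for all files under `root`.

    logical_bytes sums st_size (data fork only); on_disk_bytes sums
    st_blocks * 512, which reflects the space actually allocated.
    """
    logical = 0
    on_disk = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            # lstat so that symlinks are measured themselves, not their targets
            st = os.lstat(os.path.join(dirpath, name))
            logical += st.st_size
            # st_blocks is always in 512-byte units, per man 2 stat
            on_disk += st.st_blocks * 512
    return logical, on_disk
```

Comparing the two numbers for a folder gives a rough idea of how much space is taken by resource forks and block overhead together.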

Dietrich Epp
2

Two options:

You could try using subprocess to call the system's "ls" or "du" command, which should be aware of the extended attributes.

or

You could install the xattr package, which can read the resource fork in addition to extended attributes (it's accessed via xattr.XATTR_RESOURCEFORK_NAME). Something like this might work:

import xattr

x = xattr.xattr("/path/to/my/file")

size_ = 0
for attribute in x:
    size_ += len(x[attribute])

print(size_)

You might need to play around a little with the format of the extended attributes, as they're returned as strings but might be binary (?).

If you provide a minimal almost working example of code, I might be able to play with it a little more.
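For the first option, a rough sketch of calling du through subprocess might look like the following. The helper name `du_kilobytes` is made up, and whether du's total includes resource forks and extended attributes on a given system is an assumption worth checking against Finder's "Get Info".

```python
import subprocess


def du_kilobytes(path):
    """Return the disk usage of `path` in 1024-byte units, as reported by du.

    Hypothetical helper: -s asks for a single summary line and -k forces
    kilobyte units; the first field of the output is the size.
    """
    result = subprocess.run(
        ["du", "-sk", path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if du fails
    )
    return int(result.stdout.split()[0])
```

This measures allocated space rather than byte counts, so it is comparable to st_blocks totals, not to os.path.getsize sums.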

Noah