I am writing a C program that recursively lists all files in a given directory, and I now need to work out the internal fragmentation.

I have spent a long time researching this and found that internal fragmentation on ext2 occurs only in the last block of a file. I know that, in theory, you should be able to get the first and last block addresses from an inode number, but I have no idea how.

I have looked into stat(), fcntl() and various other approaches. How do I get the last block address from an inode number?

I have also worked out that once I have the address of the last block, I can check how much free space it contains, and that gives me the internal fragmentation.

I know there are get_inode and get_block commands, but beyond that I have no idea!

ymn
Charlie
  • Check how the `filefrag` utility works. http://linux.die.net/man/8/filefrag says it uses FIEMAP or FIBMAP, which are ioctls – osgx Dec 11 '11 at 21:57
  • Thanks, I'm trying to find filefrag's code now to see how it does it... – Charlie Dec 11 '11 at 22:20
  • Ollie, it is part of e2fsprogs and the path is `/misc/filefrag.c`. This utility is Linux-specific and may not work with some file systems (ext2/3/4 are supported) – osgx Dec 12 '11 at 12:41
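
The FIBMAP route mentioned in the comments can be sketched roughly as follows. This is a minimal illustration, not filefrag's actual code: it assumes Linux, a file system that supports the FIBMAP ioctl, and a process with the privilege FIBMAP requires (normally root). The function names `last_logical_block` and `last_physical_block` are hypothetical helpers introduced here.

```c
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <linux/fs.h>    /* FIBMAP, FIGETBSZ (Linux-specific) */
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: index of the last logical block of a file that is
 * `size` bytes long, or -1 for an empty file. */
long long last_logical_block(long long size, long block_size)
{
    assert(block_size > 0);
    if (size <= 0)
        return -1;
    return (size - 1) / block_size;
}

/* Hypothetical helper: physical block number of the last block of `path`.
 * Returns 0 if that block is a hole (unmapped), and -1 on any error,
 * including an empty file or missing privileges. */
long last_physical_block(const char *path)
{
    struct stat st;
    int blksz = 0;
    long result = -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* FIGETBSZ reports the file system block size used by FIBMAP. */
    if (fstat(fd, &st) == 0 && ioctl(fd, FIGETBSZ, &blksz) == 0 && blksz > 0) {
        long long idx = last_logical_block((long long)st.st_size, blksz);
        if (idx >= 0 && idx <= INT_MAX) {
            int blk = (int)idx;   /* in: logical block, out: physical block */
            if (ioctl(fd, FIBMAP, &blk) == 0)
                result = blk;
        }
    }
    close(fd);
    return result;
}
```

Calling `last_physical_block("somefile")` as an unprivileged user is expected to fail and return -1, which matches the point made in the answers that block addresses need elevated privileges.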

2 Answers

1

I don't think you can get at the addresses of disk blocks via the regular system calls such as stat(). You would probably have to find the raw inode on disk (which means accessing the raw disk, and requires elevated privileges) and process the data from there.

Classically, you'd find direct blocks, indirect blocks, double-indirect blocks and a triple-indirect block for a file. However, the relevant file system type is about as dead as the dodo is (I don't think I've seen that file system type this millennium), so that's unlikely to be much help now.
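
To make the classical layout concrete, here is a small sketch that works out which kind of pointer would hold a given logical block. It assumes the ext2-style arrangement of 12 direct pointers per inode and 4-byte block numbers; the name `indirection_level` is a hypothetical helper, and nothing here reads a real inode from disk.

```c
#include <assert.h>

enum { NDIRECT = 12 };   /* assumption: 12 direct pointers, as in ext2 */

/* Hypothetical helper: which pointer level maps logical block `lblock`?
 * 0 = direct, 1 = single-indirect, 2 = double-indirect,
 * 3 = triple-indirect, -1 = beyond what the layout can address. */
int indirection_level(unsigned long long lblock, unsigned long long block_size)
{
    assert(block_size >= 4);
    /* Assumption: each block number is 4 bytes, so one block holds p of them. */
    unsigned long long p = block_size / 4;

    if (lblock < NDIRECT)
        return 0;
    lblock -= NDIRECT;
    if (lblock < p)
        return 1;
    lblock -= p;
    if (lblock < p * p)
        return 2;
    lblock -= p * p;
    if (lblock < p * p * p)
        return 3;
    return -1;
}
```

With 1024-byte blocks, for example, logical blocks 0 to 11 are direct and the next 256 go through the single-indirect block. Finding the last block of a large file from the raw inode therefore means following up to three levels of pointers.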

There might be a non-standard system call to get at the information, but I doubt it.

Jonathan Leffler
  • So it will be a massive pain in the ass? This is what my documentation says: '3. The next step would be to work out how you can traverse through all of the directories and access each file’s starting i-node. From that point, you can then identify the last block of the file and work out how much space is left within the block.' So maybe they are lying to me! – Charlie Dec 11 '11 at 21:35
  • Yes, it will be a massive PitA. Given the inode, you can (in theory) find the last block of the file by reading the inode itself from disk (hence the need for privileged access to the raw disk) and determining where the last block is stored, etc. _OTOH_, let's step back...isn't the size of the file, modulo the disk block size, the amount of space in use in the last block? This doesn't give you the disk addresses of those blocks - but it does give you the information you need to deduce the internal fragmentation, doesn't it? – Jonathan Leffler Dec 11 '11 at 22:12
  • Yeah, but I'm working to a degree coursework spec. I wish I could change what they have said we need to do, but I can't :-( I have spent about 2 days looking into this now and it is really starting to wind me up! – Charlie Dec 11 '11 at 22:15
1

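Maybe you are overcomplicating this. Roughly, the internal fragmentation can be calculated from the file size modulo the block size: the remainder is the number of bytes used in the last block, so the block size minus that remainder (when the remainder is non-zero) is the wasted space.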

But this is only valid if the file is a "classic" one. With sparse files, or files holding a lot of "other information" (such as huge ACLs or extended attributes), there might be a difference. (I don't know where that information is stored, but I could imagine some file systems storing it in the last block, effectively (but unnoticeably) reducing the internal fragmentation.)
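
The calculation above can be sketched in C as follows. This is a minimal illustration under the "classic file" assumption: it ignores sparse files and extended attributes, and it takes the allocation unit from statvfs() (`f_frsize`, falling back to `f_bsize`), which is an assumption about what the file system reports rather than something confirmed for every file system. The names `slack_bytes` and `file_slack` are hypothetical.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/statvfs.h>

/* Hypothetical helper: unused bytes in the last block of a file of `size`
 * bytes when space is allocated in units of `block_size` bytes. */
long long slack_bytes(long long size, long long block_size)
{
    assert(block_size > 0);
    long long used = size % block_size;   /* bytes used in the last block */
    return used == 0 ? 0 : block_size - used;
}

/* Hypothetical helper: slack of the regular file at `path`,
 * or -1 on error or if `path` is not a regular file. */
long long file_slack(const char *path)
{
    struct stat st;
    struct statvfs vfs;

    if (stat(path, &st) != 0 || statvfs(path, &vfs) != 0)
        return -1;
    if (!S_ISREG(st.st_mode))
        return -1;

    /* Assumption: f_frsize is the allocation unit; use f_bsize if it is 0. */
    unsigned long unit = vfs.f_frsize ? vfs.f_frsize : vfs.f_bsize;
    if (unit == 0)
        return -1;
    return slack_bytes((long long)st.st_size, (long long)unit);
}
```

For a 5000-byte file on a file system with 4096-byte blocks, `slack_bytes(5000, 4096)` gives 3192. In a recursive directory walk, lstat() may be a better choice than stat() so that symbolic links are not followed.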

glglgl