6

I'm having a little difficulty understanding how exactly this works.

It seems that unlink() will remove the inode which refers to the file's data, but won't actually delete the data. If this is the case,

a) what happens to the data? Presumably it doesn't stick around forever, or people would be running out of disk space all the time. Does something else eventually get around to deleting data without associated inodes, or what?

b) if nothing happens to the data: how can I actually delete it? If something automatically happens to it: how can I make that happen on command?

(Auxiliary question: if the shell commands rm and unlink do essentially the same thing, as I've read on other questions here, and Perl unlink is just another call to that, then what's the point of a module like File::Remove, which seems to do exactly the same thing again? I realize "there's more than one way to do it", but this seems to be a case of "more than one way to say it", with "it" always referring to the same operation.)

In short: can I make sure deleting a file actually results in its disk space being freed up immediately?

CFK
  • 2
    `File::Remove` does recursion, `rmdir`, and globbing for you, while `unlink` does not. What exactly do you mean by "deleting the data"? AFAIK, disk space is freed once the file has been unlinked and every process has closed its file handle for it. – mpapec Oct 21 '14 at 12:47
  • @mpapec: `unlink` just removes a link. The space won't be reclaimed until there are no longer any links to the data – Borodin Oct 21 '14 at 12:57

3 Answers

12

Each inode on your disk has a reference count: it knows how many places refer to it. A directory entry is a reference, and multiple references to the same inode can exist. unlink removes a reference. When the reference count reaches zero, the inode is no longer in use and may be deleted, along with the data blocks it points to. This is how many things work, such as hard linking and snapshots.
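To make the reference count concrete, here is a minimal sketch in Python rather than Perl, on the assumption that both simply wrap the same POSIX calls (link, unlink, stat). The file names are made up for illustration, and it assumes a filesystem that supports hard links.

```python
import os
import tempfile


def link_count_demo():
    """Return the link counts seen while creating, hard-linking and unlinking a file."""
    counts = []
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")  # hypothetical file names
        alias = os.path.join(d, "alias.txt")

        with open(original, "w") as fh:
            fh.write("some data\n")
        counts.append(os.stat(original).st_nlink)   # one directory entry so far

        os.link(original, alias)                    # second directory entry, same inode
        counts.append(os.stat(original).st_nlink)

        os.unlink(original)                         # removes one reference only
        counts.append(os.stat(alias).st_nlink)

        with open(alias) as fh:
            data = fh.read()                        # data is still reachable via the alias
    return counts, data


if __name__ == "__main__":
    print(link_count_demo())
```

The space is only given back once the last name is removed as well, which here happens when the temporary directory is cleaned up.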

In particular - an open file handle is a reference. So you can open a file, unlink it, and continue to use it - it'll only be actually removed after the file handle is closed (provided the reference count drops to zero, and it's not open/hard linked anywhere else).
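A rough sketch of that open-then-unlink behaviour, again in Python as a stand-in for the underlying POSIX calls. It assumes a Unix-like system (Windows generally refuses to remove a file that is still open), and the function name is just illustrative.

```python
import os
import tempfile


def unlink_while_open():
    """Unlink a file while it is still open and keep reading from the handle."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w+") as fh:
        fh.write("still here\n")
        fh.flush()

        os.unlink(path)                          # the directory entry goes away now
        name_gone = not os.path.exists(path)

        links = os.fstat(fh.fileno()).st_nlink   # link count as seen through the handle
        fh.seek(0)
        data = fh.read()                         # the data is still readable
    # Only here, once the handle is closed, can the space be reclaimed.
    return name_gone, links, data


if __name__ == "__main__":
    print(unlink_while_open())
```

This is also why a large deleted log file can keep occupying disk space until the process that has it open closes it or exits.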

Sobrique
  • @ysth, In that post, "it" is the inode, so it disappears when Sobrique says, not when you said. The directory entry disappears as soon as `unlink` is called, though. – ikegami Oct 21 '14 at 18:45
3

unlink() removes a link (a name, if you like, but technically a record in a directory file) to the data, which is referred to by an inode. Once there are no more links to the data, the system automatically frees the associated space. The number of links to an inode is tracked in the inode itself. You can see the inode number and the current link count of a file with ls -li, for example:

789994 drwxr-xr-x+  29 john  staff      986 11 nov  2010 SCANS
 23453 -rw-r--r--+   1 erik  staff      460 19 mar  2011 SQL.java

This means that inode 789994 has 29 links to it and that inode 23453 has only 1. SQL.java is an entry in the current directory that points to inode 23453. If you remove that record from the directory (with the unlink system call or the rm command), the count drops to 0 and the system frees the corresponding space: a count of 0 means there is no longer any link/name through which the data can be reached, so it can be freed (once no process still holds the file open).
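The inode number and link count in the listing above can also be read from a program. A small sketch in Python (used here only as a convenient wrapper around stat; the helper name list_inodes is made up):

```python
import os


def list_inodes(directory="."):
    """Return (inode number, link count, name) for each entry in a directory."""
    rows = []
    for entry in sorted(os.scandir(directory), key=lambda e: e.name):
        st = entry.stat(follow_symlinks=False)   # stat the entry itself, not a symlink target
        rows.append((st.st_ino, st.st_nlink, entry.name))
    return rows


if __name__ == "__main__":
    for inode, links, name in list_inodes("."):
        print(f"{inode:>10} {links:>4} {name}")
```

Two names that show the same inode number are hard links to the same data, and each of them contributes 1 to the link count.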

Jean-Baptiste Yunès
1

Removing the link just means the space is no longer reserved for a certain file name; the bytes themselves are not wiped at that point. The data will only be erased/destroyed when something else is allocated to that space and written there. This is why people write zeroes or random data to a drive after deleting sensitive data like financial records.
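If the worry is the old bytes rather than the disk space, one common approach is to overwrite the file before unlinking it. A minimal sketch in Python, with a made-up helper name overwrite_and_unlink. It assumes the filesystem rewrites blocks in place, which is not guaranteed on SSDs or on journaling and copy-on-write filesystems, so treat it as an illustration rather than a secure-erase tool.

```python
import os


def overwrite_and_unlink(path, chunk_size=1024 * 1024):
    """Overwrite a file's contents with zero bytes, then remove its name.

    Returns the number of bytes that were overwritten.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as fh:        # open for update without truncating
        remaining = size
        while remaining > 0:
            n = min(chunk_size, remaining)
            fh.write(b"\0" * n)
            remaining -= n
        fh.flush()
        os.fsync(fh.fileno())            # ask the OS to push the zeroes to the device
    os.unlink(path)                      # now remove the directory entry
    return size
```

Note that any other hard link to the same inode will see the zeroed contents too, since all names share the same data.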

Shawn Darichuk
  • 1
    Overwriting is generally good for erasure, but unlink isn't directly connected to space reservations; it's purely about references via directory entries. – Sobrique Oct 22 '14 at 13:45