Somewhere online I've seen a technique to immediately unlink a temporary file after opening it, since you will discard it anyway. As per my understanding of the man-page for unlink, the file will not be unlinked as long as it has an open file descriptor.

When I execute the following piece of code:

char *file, *command;
asprintf(&file, "/tmp/tempXXXXXX");
int fd = mkstemp(file);
unlink(file);

asprintf(&command, "some_command > %s", file); /*Writes n bytes in temp file*/
FILE *f = popen(command, "re");
pclose(f);

struct stat sbuf;
fstat(fd, &sbuf);
printf("%lld\n", (long long)sbuf.st_size);

close(fd);
free(file);
free(command);
exit(0); 

It prints a size of 0. However, if I comment out unlink(file), it shows the correct file size. I would expect both scenarios to show the correct size of the file, since unlink should wait until no process has the file open anymore. What am I missing here?

Markinson
  • You’re creating file X, removing it but keeping it open, and then creating *another* file with the same name. The original will not be written to ever so it stays empty. Unlink removes the link to the file from the filename in the filesystem, it just won’t free the space until all handles are closed. – Sami Kuhmonen Oct 07 '22 at 15:01
  • @SamiKuhmonen but it does show the correct number of bytes when I do not unlink the file. If popen creates another file with the same name (as I understand from your comment), I would expect the program to print 0 as well when removing the unlink command (so in both scenarios). – Markinson Oct 07 '22 at 15:07
  • @Markinson Re. `"popen creates another file with the same name"`: that's because the call to `unlink` removed the original from the file system's index so it's no longer accessible using its name. If you comment out the call to `unlink` then the original remains visible and is written to by the process started by `popen`. – G.M. Oct 07 '22 at 15:11

1 Answer

You're missing the fact that the file referred to by your fd is not the same file as that created by your call to popen().

In a POSIX-like shell, `some_command > some_file` will create some_file if it does not already exist, otherwise it will truncate some_file.

Your call to popen() invokes a shell, which in turn creates or truncates the output file before invoking some_command as per POSIX.
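To make the redirection step concrete, here is a minimal sketch of roughly what the shell does for `> some_file` before it runs the command. The helper names `open_like_redirect` and `redirect_reuses_file` are made up for illustration, and the exact flags and mode a given shell passes to `open()` are an assumption (`O_WRONLY | O_CREAT | O_TRUNC` with mode `0666` is the usual shape).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: approximately what a POSIX shell does for `> path`.
 * O_CREAT creates the name if it is missing; O_TRUNC empties the file
 * if the name already exists. Returns a descriptor, or -1 on error. */
int open_like_redirect(const char *path)
{
    assert(path != NULL);
    return open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
}

/* Hypothetical helper: performs the redirect-style open on `path` and
 * compares the result with the file behind `existing_fd`.
 * Returns 1 if both are the same file (same device and inode),
 * 0 if the open produced a different file, -1 on error. */
int redirect_reuses_file(int existing_fd, const char *path)
{
    struct stat before, after;
    int rfd = open_like_redirect(path);
    if (rfd < 0)
        return -1;
    if (fstat(existing_fd, &before) != 0 || fstat(rfd, &after) != 0) {
        close(rfd);
        return -1;
    }
    close(rfd);
    return before.st_dev == after.st_dev && before.st_ino == after.st_ino;
}
```

With a descriptor from `mkstemp()` still open, `redirect_reuses_file(fd, path)` should report the same file while the name exists, and a different file once the name has been unlinked.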

Since you have unlinked some_file before the call to popen(), the file is created anew: that is, the output file set up by your popen() shell is allocated under a different inode than the (now anonymous) file created by your previous call to mkstemp().

You can see that the files are different if you compare st_ino values from your fstat() (by fd) and a separate call to stat() (by name) after the popen().
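As a rough sketch of that comparison, the function below replays the sequence from the question and then looks at the file once through the descriptor and once through the name. Two things here are assumptions of mine and not from the question: `echo hello` stands in for some_command, and the function name `unlinked_redirect_differs` is invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Replays the question's sequence with `echo hello` as a stand-in command.
 * Returns 1 if the descriptor and the name refer to different files,
 * 0 if they refer to the same file, -1 on error. */
int unlinked_redirect_differs(void)
{
    char path[] = "/tmp/tempXXXXXX";
    char command[128];
    struct stat by_fd, by_name;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the name is gone; the file itself lives on via fd */

    int n = snprintf(command, sizeof command, "echo hello > %s", path);
    assert(n > 0 && (size_t)n < sizeof command);

    FILE *p = popen(command, "r"); /* the shell re-creates the name here */
    if (p == NULL) {
        close(fd);
        return -1;
    }
    pclose(p); /* waits for the shell to finish */

    if (fstat(fd, &by_fd) != 0 || stat(path, &by_name) != 0) {
        close(fd);
        return -1;
    }

    printf("by fd:   inode %llu, size %lld\n",
           (unsigned long long)by_fd.st_ino, (long long)by_fd.st_size);
    printf("by name: inode %llu, size %lld\n",
           (unsigned long long)by_name.st_ino, (long long)by_name.st_size);

    int differs = by_fd.st_dev != by_name.st_dev ||
                  by_fd.st_ino != by_name.st_ino;

    unlink(path); /* remove the file the shell created */
    close(fd);    /* the original, nameless file is freed now */
    return differs;
}
```

Called from a small `main()`, this should print two different inode numbers, with the size seen through the descriptor staying at 0 because nothing ever wrote to the original file.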

pilcrow
  • Thanks! My confusion comes from the following (man-page of unlink): `If the name was the last link to a file but any processes still have the file open, the file will remain in existence until the last file descriptor referring to it is closed.`. Therefore, I would not expect the file to be unlinked (or deleted) already, since it still has an open file descriptor. What am I missing? – Markinson Oct 10 '22 at 07:19
  • @Markinson the _file_ remains in existence — the data under its inode — but its name in the file system no longer refers to that data. References to that name will fail (ENOENT) or cause the allocation of new file storage under that name (O_CREAT). Does that make sense? We often don’t distinguish between a file’s path/name and its contents, but they are different. _unlink_ disconnects a name from the file beneath it. (With hard links of course the very same file may have more than one name.) – pilcrow Oct 10 '22 at 11:29
  • I see! I did some more research into the inner workings of filesystems and inodes and it makes sense now. Thanks for your time. – Markinson Oct 10 '22 at 12:33