Difference between creating a duplicate file descriptor using dup() and creating a hard link?

Question

I just tried out a program in which I use dup to duplicate the file descriptor of an open file.

I had made a hard link to this same file and I opened the same file to read the contents of the file in the program.

My question is what is the difference?

I understand that dup gives me a run-time abstraction of the file and that a hard link belongs more to the filesystem implementation, but I do not understand why I would use one over the other.

What are the advantages of using one over the other?

Why can't we explicitly refer to the hard link if we want to refer to the same file locations instead of creating a file descriptor and vice versa?

I am using Linux and the standard C library.

Basile Starynkevitch · Answer 1 · 2012-09-09T07:51:43.370

Hard links work on i-nodes, dup works on opened file descriptors. These are different animals.

A file is mostly an inode, with directory entries pointing to that inode. So a file can have more than one name through hard links, and other files can have no name at all: a temporary file that is still open but has been unlinked has an inode referred to by an open file descriptor, but no longer any name. Inodes exist for the lifetime of the file and are written to disk.
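The nameless-file case can be tried out with a small sketch. The function name `read_after_unlink` and the scratch path passed to it are made up for illustration, and the sketch assumes a local Linux filesystem where unlinking an open file is allowed:

```c
#include <fcntl.h>
#include <unistd.h>

/* Create a file, remove its only name, then read it back through the
 * descriptor that is still open. Returns the number of bytes read, or
 * -1 on error. `path` is a hypothetical scratch path. */
ssize_t read_after_unlink(const char *path, char *buf, size_t len)
{
    static const char msg[] = "still here";
    const ssize_t msg_len = (ssize_t)(sizeof msg - 1);
    ssize_t n = -1;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    if (write(fd, msg, (size_t)msg_len) == msg_len &&
        unlink(path) == 0 &&                /* remove the only name */
        lseek(fd, 0, SEEK_SET) == 0)        /* rewind through the descriptor */
        n = read(fd, buf, len);             /* the data is still readable */

    close(fd);  /* last reference gone: the kernel can now free the inode */
    return n;
}
```

Called with a scratch path under /tmp and a buffer of at least 10 bytes, this should read back the 10 bytes that were written, even though the path no longer exists by the time of the read.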

A file descriptor exists only within a process (in kernel memory, not on disk), so it cannot be written to disk (you could only write its number, which usually doesn't make any sense). A file descriptor knows (inside the kernel) its inode, but also some more state, notably the current offset.

You could have two file descriptors working on the same file (the same inode, perhaps by open-ing two different hardlinked or symlinked paths to it) but having different state (e.g. different file position or offset).
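A minimal sketch of that independence, assuming a writable scratch path (the function name `byte_seen_by_second_open` and the path are hypothetical). It writes "AB", opens the file twice, and reads one byte through each descriptor in turn:

```c
#include <fcntl.h>
#include <unistd.h>

/* Write "AB" to a scratch file, open it twice, read one byte through the
 * first descriptor, then one byte through the second. Returns the byte
 * seen by the second descriptor, or -1 on error. */
int byte_seen_by_second_open(const char *path)
{
    char first = 0, second = 0;
    int result = -1;

    int w = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (w < 0)
        return -1;
    ssize_t written = write(w, "AB", 2);
    close(w);
    if (written != 2)
        return -1;

    int a = open(path, O_RDONLY);
    int b = open(path, O_RDONLY);       /* a separate open, with its own offset */
    if (a >= 0 && b >= 0 &&
        read(a, &first, 1) == 1 &&      /* advances only a's offset */
        read(b, &second, 1) == 1)       /* b still starts at offset 0 */
        result = (unsigned char)second;

    if (a >= 0)
        close(a);
    if (b >= 0)
        close(b);
    unlink(path);                       /* clean up the scratch file */
    return result;
}
```

Because each open() gets its own offset, the second descriptor should return 'A' here, not 'B'.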

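With the dup(2) syscall, the two file descriptors refer to the same open file description, so they share the same state; in particular they share the same file offset (position), and a read or seek through one moves the other. This stays true after the dup, not only at the moment of the call; only the close-on-exec flag is kept per descriptor.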
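The shared offset can be observed with a sketch like the following. The function name `byte_seen_by_dup` and the scratch path are hypothetical. It writes "AB", duplicates the descriptor, then reads one byte through the original and one through the copy:

```c
#include <fcntl.h>
#include <unistd.h>

/* Write "AB" to a scratch file, rewind, dup() the descriptor, read one
 * byte through the original and then one byte through the duplicate.
 * Returns the byte seen by the duplicate, or -1 on error. */
int byte_seen_by_dup(const char *path)
{
    char first = 0, second = 0;
    int copy = -1;
    int result = -1;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    if (write(fd, "AB", 2) == 2 &&
        lseek(fd, 0, SEEK_SET) == 0 &&      /* rewind to the start */
        (copy = dup(fd)) >= 0 &&            /* same open file description */
        read(fd, &first, 1) == 1 &&         /* moves the shared offset to 1 */
        read(copy, &second, 1) == 1)        /* so this read continues from 1 */
        result = (unsigned char)second;

    if (copy >= 0)
        close(copy);
    close(fd);
    unlink(path);                           /* clean up the scratch file */
    return result;
}
```

Here the duplicate should return 'B', since the read through the original descriptor already moved the offset that both descriptors share.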

If using link(2) syscall, the two directory entries point to the same inode. They need to be on the same filesystem.
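As a rough illustration, the sketch below creates a file, gives it a second name with link(2), and compares what stat(2) reports for both names. The function name `link_count_after_link` and both paths are made up for the example, and the two paths are assumed to be on the same filesystem:

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a file at `path`, add a second name `alias` with link(2), and
 * check that both names resolve to the same inode. Returns the link
 * count reported for that inode, or -1 on error. */
int link_count_after_link(const char *path, const char *alias)
{
    struct stat s1, s2;
    int result = -1;

    unlink(path);                   /* start clean; errors are ignored */
    unlink(alias);

    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    close(fd);

    if (link(path, alias) == 0 &&               /* second directory entry */
        stat(path, &s1) == 0 &&
        stat(alias, &s2) == 0 &&
        s1.st_dev == s2.st_dev &&
        s1.st_ino == s2.st_ino)                 /* same inode under both names */
        result = (int)s1.st_nlink;

    unlink(alias);                  /* clean up both names */
    unlink(path);
    return result;
}
```

With two names pointing at one inode, the returned link count should be 2. No descriptor is involved once the file has been created: the link lives purely in the directory structure.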

And a symlink(2) syscall creates a new inode (and a new file) whose content is the path it refers to, stored as a string. See also the path_resolution(7) and symlink(7) man pages.
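The contrast with a hard link can be sketched by comparing stat(2), which follows the symlink, with lstat(2), which looks at the symlink itself. The function name `symlink_has_own_inode` and the paths are hypothetical, and `target` is assumed to be an absolute path so that the link resolves regardless of where it is created:

```c
#define _XOPEN_SOURCE 700   /* for lstat() and symlink() under strict -std modes */
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a regular file at `target` and a symlink to it at `linkpath`.
 * Returns 1 if the symlink is a separate inode from its target, 0 if
 * not, and -1 on error. */
int symlink_has_own_inode(const char *target, const char *linkpath)
{
    struct stat via_link, link_itself;
    int result = -1;

    unlink(linkpath);               /* start clean; errors are ignored */
    unlink(target);

    int fd = open(target, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    close(fd);

    if (symlink(target, linkpath) == 0 &&
        stat(linkpath, &via_link) == 0 &&       /* follows the symlink */
        lstat(linkpath, &link_itself) == 0)     /* examines the symlink itself */
        result = S_ISLNK(link_itself.st_mode) &&
                 S_ISREG(via_link.st_mode) &&
                 link_itself.st_ino != via_link.st_ino;

    unlink(linkpath);               /* clean up both files */
    unlink(target);
    return result;
}
```

On a typical Linux filesystem this should return 1: the symlink is its own small file, whereas the two names created by link(2) share one inode.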

This means that a file opened by some process and then unlinked will still be accessible after all the links to it are removed. – Arpith, Sep 09 '12 at 07:12
Yes, this is how temporary files are done. But you can access that file only through previously opened file descriptors. The file is removed from the disk only when the last descriptor referring to it has been closed (for instance because the last process holding it open has terminated). – Basile Starynkevitch, Sep 09 '12 at 07:22
Is there any advantage in using dup to access the contents of the same file over opening a hard link to the same file in a program? Both are system calls after all. – Arpith, Sep 09 '12 at 07:24

score 0 · Answer 2 · answered Sep 09 '12 at 06:59

A hard link is just a way to give the same file two names, possibly in two different directories. It saves some disk space compared with keeping two identical copies of the file.

Using dup lets you have two different file descriptors in your program that point to the same file. It is useful if you want to duplicate some kind of logical object that wraps a file descriptor.

David Grayson

How is it helpful in saving disk space? They point to the same inode. Size info is in the metadata, not in the hard link. – Arpith Sep 09 '12 at 07:02
It's an optimization over having two identical files. – tripleee Sep 09 '12 at 07:10
Notice also that a file descriptor has additional state. If you open a file, read a byte, then duplicate the descriptor, the next read returns the second byte of the file whichever of the two descriptors you use, because they share one offset. – tripleee Sep 09 '12 at 07:12

score 0 · Answer 3 · answered Sep 09 '12 at 07:30

The main difference is that a hard link is persistent and a duplicated file descriptor only lasts as long as the process. Plus the reasons already given.

Alex Brown

OK I think I understand now. Is there any reason why dup is preferred over opening the same file twice? The kernel will obviously allocate two different file descriptors and they will point to the same file. Are there any delay or latency issues with open not present in using dup? – Arpith Sep 09 '12 at 07:37
I can't see why you would want a hard link. Hard links serve a specific purpose to solve sharing problems in the file system namespace. I don't think that's what you are after. – Alex Brown Sep 09 '12 at 07:40
