
I would like to check if two file handles refer to the same file. In order to do this, can I use the stat function applied to each file handle? Thanks in advance

my $file = 'C:\temp\file.txt';
open( TXT1, "> $file" );
open( TXT2, "> $file" );
print( "The handles refer to the same file!") if (\TXT1 eq \TXT2);
ikegami
fabrizio
  • If you are on a filesystem that supports it, you can compare the `dev` and `ino` values from stat. However, your filename indicates you're on Windows, where these values are not meaningful. – Grinnz Feb 07 '19 at 18:11
  • Can't do that in general. An existing filehandle need not be associated with a file at all; could be a socket/pipe/etc. Can you change how the files are opened in that program? Then you can add a hash to keep track of them using `fileno` for instance – zdim Feb 07 '19 at 18:36
  • Note: it is not filehandle**R** (with "r" at the end), it is filehandle (no "r"). It's a handle -- something associated with a file and given to us, so that we can handle the file. – zdim Feb 07 '19 at 18:53
  • Seconding @zdim's advice to use lexical filehandles, three-arg open, and error handling. The [File::Open](https://metacpan.org/pod/File::Open) module can wrap all of this up nicely for you, as well as modules like [Path::Tiny](https://metacpan.org/pod/Path::Tiny). – Grinnz Feb 07 '19 at 19:47
  • @zdim, It's called a handle because it allows us to hold onto a system resource. – ikegami Feb 08 '19 at 05:58
  • @ikegami Eh, I guess that's right. I like to think of a "handle" being something that allows me to ... handle the thing. But I guess it allows me to hold it .. um ... little circular – zdim Feb 08 '19 at 07:08

2 Answers


That can't be done in general (portably), which is understandable given that a "file"handle need not be associated with a file at all. One thing you can do is record the fileno for each filehandle.

So when opening a file

my %filename_fileno;

open my $fh, '>', $file or die "Can't open $file: $!";

$filename_fileno{fileno $fh} = $file;

and then you can look it up when needed

say "Filename is: ", $filename_fileno{fileno $fh};

Don't forget to remove the entry from the hash when that file is (to be) closed

delete $filename_fileno{fileno $fh};
close $fh;

So these should go in utility functions. Since more care is needed, as outlined in the footnotes below, this would altogether make for a nice little module. One could then also consider extending (inheriting from) a related module, like Path::Tiny.
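As a rough sketch of what such utility functions might look like (the names `open_tracked`, `filename_of` and `close_tracked` are made up for illustration and do not come from any module), assuming every file in the program is opened through them:

```perl
use strict;
use warnings;
use feature 'say';
use File::Temp qw(tempdir);

# Hypothetical helpers; the names are illustrative only.
my %filename_fileno;    # file descriptor number => name used to open it

# Open a file and record which name its descriptor belongs to.
sub open_tracked {
    my ($mode, $file) = @_;
    open my $fh, $mode, $file or die "Can't open $file: $!";
    $filename_fileno{ fileno $fh } = $file;
    return $fh;
}

# Return the recorded name, or undef for handles we did not open ourselves.
sub filename_of {
    my ($fh) = @_;
    my $fd = fileno $fh;
    return defined $fd ? $filename_fileno{$fd} : undef;
}

# Drop the entry before closing, since descriptor numbers get reused.
sub close_tracked {
    my ($fh) = @_;
    my $fd = fileno $fh;
    delete $filename_fileno{$fd} if defined $fd;
    return close $fh;
}

# Usage: two handles opened in append mode on the same (temporary) file.
my $path = tempdir(CLEANUP => 1) . "/file.txt";
my $fh1  = open_tracked('>>', $path);
my $fh2  = open_tracked('>>', $path);
say "The handles were opened with the same name"
    if filename_of($fh1) eq filename_of($fh2);
close_tracked($_) for $fh1, $fh2;
```

This only compares the names as they were given to open, so it says nothing about links or differently spelled paths to the same file.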

Note: You cannot safely write to a file from separate filehandles as in the question. Each filehandle keeps track of its own position in the file, so writes through one handle will clobber intermediate writes made through the other.
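A small demonstration of that clobbering, as a sketch only: it assumes a throwaway temporary file, raw (untranslated) output and autoflush on both handles so that the order of the writes is predictable.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Handle;    # autoflush method on lexical handles

# Throwaway file for the demonstration.
my $demo_file = tempdir(CLEANUP => 1) . "/clobber.txt";

# Two independent handles on the same file, each with its own position.
open my $out1, '>:raw', $demo_file or die "Can't open $demo_file: $!";
open my $out2, '>:raw', $demo_file or die "Can't open $demo_file: $!";
$_->autoflush(1) for $out1, $out2;

print {$out1} "AAAAAA\n";    # lands at offset 0, the position of $out1
print {$out2} "BB\n";        # $out2 is also at offset 0, so this overwrites

close $out1 or die "Can't close: $!";
close $out2 or die "Can't close: $!";

# Read the file back to see what survived.
open my $in, '<:raw', $demo_file or die "Can't open $demo_file: $!";
my $content = do { local $/; <$in> };
close $in;

print $content;
```

The second write replaces the start of the first one instead of following it, so part of the "A" line is lost.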

Note: Use lexical filehandles (my $fh) and not globs (FH), use the three-argument open, and always check the open call.


  On some (most?) Linux systems you can use the /proc filesystem

say readlink("/proc/$$/fd/" . fileno $fh);

and on more (all?) Unix-y systems you can use the (device and) inode number

say for (stat $fh)[0,1];

  Links, both soft (symbolic) and hard, can be used to change the data and have different names. So we can have different filenames but same "file" (data).

On Windows systems the best way to check is given in this post, except for the hardlink case for which one would have to use the other answer's method (parse output), as far as I can tell.

Also, non-canonical names, different capitalization (on case-insensitive systems), short/long names on some systems, and possibly more can make for different names for the same file. These are easier to clean up, using modules, but that handling needs to be added as well.

On most (all?) other systems the notion of inode and any available stat-like functionality makes these a non-issue, since device+inode refers uniquely to data.

Thanks to ikegami for comments on this.

zdim

Yes, on some system+device combinations, stat can be used.

use feature 'say';
use File::stat;

my $st1 = stat($fh1)
   or die $!;
my $st2 = stat($fh2)
   or die $!;

say $st1->dev == $st2->dev && $st1->ino == $st2->ino ? "same" : "different";

Notably, this won't work on NTFS or FAT32, one of which you appear to be using.

You should have your program keep track of that itself, but that's easier said than done. It's rather hard to identify if two paths refer to the same file when you have to contend with hard links, soft links, paths with ./.., capitalization/slash differences, short/long names, conventional/UNC paths, shares, etc.

For example, all of the following paths could refer to the same file:

  • C:\Moo\Foo Bar.txt
  • C:\Hardlink\Foo Bar.txt
  • C:\Softlink\Foo Bar.txt
  • C:\.\Moo\Foo Bar.txt
  • C:\Baz\..\Moo\Foo Bar.txt
  • c:\moo\foo bar.txt
  • C:/Moo/Foo Bar.txt
  • C:\Moo\FOOBAR~1.TXT
  • \\?\C:\Moo\Foo Bar.txt
  • \\127.0.0.1\C$\Moo\Foo Bar.txt
  • Z:\Moo\Foo Bar.txt

All but the last two can be handled via normalization.
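For what normalization could look like, here is a minimal sketch using the core modules Cwd and File::Spec. The helper names `normalize_path` and `same_path` are made up for illustration. It assumes the paths exist, it treats `lc` as good enough on case-tolerant systems, and it does nothing about hard links, short/long names, UNC paths or shares.

```perl
use strict;
use warnings;
use Cwd qw(abs_path);
use File::Spec;

# Hypothetical helper: reduce a path to a form that can be compared as a string.
sub normalize_path {
    my ($path) = @_;

    # abs_path resolves ".", ".." and symbolic links; it can fail
    # (or die on some platforms) when a directory does not exist.
    my $real = eval { abs_path($path) };
    return unless defined $real;

    $real = File::Spec->canonpath($real);              # tidy up separators
    $real = lc $real if File::Spec->case_tolerant;     # crude case folding
    return $real;
}

# Hypothetical helper: true if both paths normalize to the same string.
sub same_path {
    my $p = normalize_path($_[0]);
    my $q = normalize_path($_[1]);
    return defined $p && defined $q && $p eq $q;
}
```

For example, `same_path('C:\Moo\Foo Bar.txt', 'C:\Baz\..\Moo\Foo Bar.txt')` would be expected to be true when both directories exist, while a hard link under another name would still compare as different.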

ikegami