
I expected f1 to be equal to f2 in the code below, but apparently that's not the case. What are your thoughts on this? It seems like I have to write my own equals method to support it, right?

import java.io.*;

public class FileEquals
{
    public static void main(String[] args)
    {
        File f1 = new File("./hello.txt");
        File f2 = new File("hello.txt");
        System.out.println("f1: " + f1.getName());
        System.out.println("f2: " + f2.getName());
        System.out.println("f1.equals(f2) returns " + f1.equals(f2));
        System.out.println("f1.compareTo(f2) returns " + f1.compareTo(f2));
    }
}
aandeers
  • The same happens with Java 7's Path class, but there are methods like Path.normalize() or Files.isSameFile(). – Luciano Jan 19 '12 at 18:28
  • You could save all viewers of this question some time by showing the actual output. I was expecting that `equals` and `compareTo` had contradicting results. This is not the case: `equals` returns false and `compareTo` returns -58, meaning lexicographically "less than". @Luciano: Note that `Files.isSameFile` would in this case try to open the files, since the paths are not equal, and could fail with `NoSuchFileException`. – bluenote10 May 05 '16 at 16:12

7 Answers


No, they are not equal. equals compares the pathnames themselves, character by character; it does not resolve them to a file first. Even expressed as absolute paths, the two would look something like this:

some-project\.\hello.txt
some-project\hello.txt

So they are naturally different.

It seems like I have to write my own equals method to support it, right?

Probably yes. But first of all, you have to decide what you want to compare. Only pathnames? If so, compare the canonical paths like this:

f1.getCanonicalPath().equals(f2.getCanonicalPath())

But if you want to compare the content of two different files, then yes, you should write your own method, or simply copy one from somewhere on the internet.

halfer
G. Demecki
  • I actually want to do something like "fileList.contains(file)", and this method calls the equals method. – aandeers Jan 19 '12 at 18:14
  • This answer confuses me. See the source code of UnixFileSystem.java in the JDK: public int compare(File f1, File f2) { return f1.getPath().compareTo(f2.getPath()); } @G.Demecki I do not agree that equals compares absolute paths. – linjiejun Jul 20 '19 at 09:20

To compare the files properly with equals, call getCanonicalFile() first, e.g.:

public static void main(String[] args) throws IOException
{
    // getCanonicalFile() makes the path absolute and resolves "." and ".."
    File f1 = new File("./hello.txt").getCanonicalFile();
    File f2 = new File("hello.txt").getCanonicalFile();
    System.out.println("f1: " + f1.getAbsolutePath());
    System.out.println("f2: " + f2.getAbsolutePath());
    System.out.println("f1.equals(f2) returns " + f1.equals(f2));
    System.out.println("f1.compareTo(f2) returns " + f1.compareTo(f2));
}

This returns true for equals. Note that getCanonicalFile may throw an IOException, so I added that to the method signature.
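
Since the asker mentions wanting fileList.contains(file), here is a minimal sketch of how canonical files could be used for that. The class name CanonicalFileList and the helper canonical are made up for illustration; the helper rethrows the checked IOException as an UncheckedIOException so that it can be called inline.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class CanonicalFileList {
    // Hypothetical helper: resolves a File to its canonical form and
    // rethrows the checked IOException as an unchecked one.
    static File canonical(File file) {
        try {
            return file.getCanonicalFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Store only canonical files in the list.
        List<File> fileList = new ArrayList<>();
        fileList.add(canonical(new File("./hello.txt")));

        File candidate = new File("hello.txt");
        // The raw lookup compares pathnames as written, so it misses.
        System.out.println("raw lookup: " + fileList.contains(candidate));
        // The canonicalized lookup compares resolved paths, so it matches.
        System.out.println("canonical lookup: " + fileList.contains(canonical(candidate)));
    }
}
```

The idea is to canonicalize on the way in and on lookup, so that List.contains (which calls equals) always compares like with like.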

user949300

If you only want to compare the CONTENTS of each file, you could read the contents into a byte array like this:

byte[] f1 = Files.readAllBytes(file1);
byte[] f2 = Files.readAllBytes(file2);

And then compare exactly what you want from there.

Note that this method is only available from Java 7 onwards. For older versions, Guava and Apache Commons IO have methods that do something similar, but with different names and details.

Edit: A better option (especially if you're comparing large files) might be to simply compare byte by byte rather than loading the entire file into memory, like this:

FileInputStream f1 = new FileInputStream(file1);
DataInputStream d1 = new DataInputStream(f1);
FileInputStream f2 = new FileInputStream(file2);
DataInputStream d2 = new DataInputStream(f2);

byte b1 = d1.readByte();
byte b2 = d2.readByte();

And then compare from there.
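
To make the byte-by-byte idea concrete, here is a minimal sketch under a few assumptions: the class name ContentCompare and both contentEquals methods are made up for illustration, it uses InputStream.read() (which returns -1 at end of stream) instead of DataInputStream.readByte(), and it checks the file sizes first as suggested in the comments. On Java 12 and later, Files.mismatch offers a similar check out of the box.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class ContentCompare {
    // Compares two streams byte by byte and stops at the first difference.
    static boolean contentEquals(InputStream a, InputStream b) {
        try {
            int x;
            do {
                x = a.read();
                if (x != b.read()) {
                    return false; // differing byte, or one stream ended early
                }
            } while (x != -1); // -1 on both sides: both streams ended together
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Compares two files without loading either one fully into memory.
    static boolean contentEquals(Path p1, Path p2) {
        try {
            if (Files.size(p1) != Files.size(p2)) {
                return false; // different sizes can never have equal content
            }
            try (InputStream a = new BufferedInputStream(Files.newInputStream(p1));
                 InputStream b = new BufferedInputStream(Files.newInputStream(p2))) {
                return contentEquals(a, b);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The buffered streams keep the single-byte reads cheap, and only one buffer per file is held in memory at any time.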

Peggy
Brian Snow
  • +1 Nice post - I learned something today (I haven't used Java 7 yet, glad to see they added a Files utility) – user949300 Jan 19 '12 at 18:25
  • I would compare the files' size first, if it's available. – Luciano Jan 19 '12 at 18:29
  • it is an incredibly bad idea to compare files like that – unbeli Jan 19 '12 at 18:43
  • @Luciano yes, testing file size first is a good idea. I don't know why size would not be available, but, if it weren't, then test `(f1.length == f2.length)` – user949300 Jan 19 '12 at 18:58
  • @unbeli Please elaborate. I've used similar code in a lot of unit tests, where one file contains correct results and one file contains the results generated by the program/algorithm. That isn't what OP wants to do (as he has since elaborated) but Brian said CONTENTS and he even capitalized it. – user949300 Jan 19 '12 at 19:01
  • @unbeli I'm also hoping you can elaborate on your comment. – Brian Snow Jan 19 '12 at 19:43
  • @Brian Snow, think about this: if the first byte of these two files is different, why read all of it? What if the files are large? Do you really need both files in memory? – unbeli Jan 19 '12 at 23:02
  • @unbeli Now that Wikipedia is back up, [link](http://en.wikipedia.org/wiki/Disk-drive_performance_characteristics#Data_transfer_rate) typical HDD throughputs are, as I understand the article, 1000Mbit per second, or ~100MB per second. So, unless you have a performance requirement that this comparison be done in less than a couple of seconds, it is just fine for files up to 100MB. – user949300 Jan 20 '12 at 19:23
  • @unbeli Also, I was looking at files in unit tests where you expect them to be equal. If the files are unlikely to be equal, and they are large, then you are absolutely right that this is a bad idea. – user949300 Jan 20 '12 at 20:02
  • @user949300 it does not matter if files are expected to be equal or not. It also does not matter what the HDD throughput is (and no, you got it wrong). – unbeli Jan 20 '12 at 21:45
  • @Brian Snow, what you wrote is not my idea. Please remove that claim, thank you. – unbeli Jan 20 '12 at 21:46
  • @unbeli If files are expected to be equal 99% of the time, 99% of the time you have to read every byte. – user949300 Jan 20 '12 at 22:03
  • @user949300 possibly, but you never have to keep both files in memory. – unbeli Jan 21 '12 at 09:12

The quickest way I found to diff two files is below.

It is just a suggested workaround.

I'm not sure about the performance (what if the files are 10 GB each?), since both files are read fully into memory.

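    import java.io.File;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.util.Arrays;
    import java.util.List;

    public class FileDiff {
        public static void main(String[] args) throws IOException {
            File file = new File("/tmp/file.txt");
            File secondFile = new File("/tmp/secondFile.txt");

            // Byte diff: true only if both files hold identical bytes
            byte[] b1 = Files.readAllBytes(file.toPath());
            byte[] b2 = Files.readAllBytes(secondFile.toPath());
            boolean sameBytes = Arrays.equals(b1, b2);
            System.out.println("same bytes? " + sameBytes);

            // Line diff: List.equals compares size, order and content;
            // containsAll would only check that c1 holds every line of c2
            List<String> c1 = Files.readAllLines(file.toPath());
            List<String> c2 = Files.readAllLines(secondFile.toPath());
            boolean sameLines = c1.equals(c2);
            System.out.println("same lines? " + sameLines);
        }
    }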

EDIT

Still, the diff utility on a Unix system would be much quicker and more verbose. It depends on what you need to compare.

DevDio

If you just want to check if the files are the same based on their path use

java.nio.file.Files#isSameFile

E.g.

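// Backslashes must be escaped inside Java string literals.
// isSameFile may access the file system and can throw an IOException
// (for example NoSuchFileException) if the files do not exist.
Assert.assertTrue(Files.isSameFile(
     new File("some-project\\.\\hello.txt").toPath(),
     new File("some-project\\hello.txt").toPath()
));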
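
As a rough sketch of the two Path-based options mentioned on this page, the snippet below contrasts a purely lexical comparison via Path.normalize() with the Files.isSameFile check. The class name SameFileCheck and both method names are made up for illustration. Note that normalize() does not resolve symbolic links, while isSameFile may need to open the files when the paths are not equal.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class SameFileCheck {
    // Lexical comparison: makes both paths absolute and removes "." and ".."
    // elements. The files themselves do not have to exist.
    static boolean sameNormalizedPath(String first, String second) {
        Path p1 = Paths.get(first).toAbsolutePath().normalize();
        Path p2 = Paths.get(second).toAbsolutePath().normalize();
        return p1.equals(p2);
    }

    // File-system comparison: delegates to Files.isSameFile and rethrows
    // the checked IOException as an unchecked one.
    static boolean sameFile(Path p1, Path p2) {
        try {
            return Files.isSameFile(p1, p2);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameNormalizedPath("./hello.txt", "hello.txt"));
    }
}
```

Which one fits depends on whether the files are guaranteed to exist and whether symbolic links should count as the same file.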
EliuX

Here is the implementation of both methods:

/**
 * Tests this abstract pathname for equality with the given object.
 * Returns <code>true</code> if and only if the argument is not
 * <code>null</code> and is an abstract pathname that denotes the same file
 * or directory as this abstract pathname.  Whether or not two abstract
 * pathnames are equal depends upon the underlying system.  On UNIX
 * systems, alphabetic case is significant in comparing pathnames; on Microsoft Windows
 * systems it is not.
 *
 * @param   obj   The object to be compared with this abstract pathname
 *
 * @return  <code>true</code> if and only if the objects are the same;
 *          <code>false</code> otherwise
 */
public boolean equals(Object obj) {
    if ((obj != null) && (obj instanceof File)) {
        return compareTo((File)obj) == 0;
    }
    return false;
}
/**
 * Compares two abstract pathnames lexicographically.  The ordering
 * defined by this method depends upon the underlying system.  On UNIX
 * systems, alphabetic case is significant in comparing pathnames; on Microsoft Windows
 * systems it is not.
 *
 * @param   pathname  The abstract pathname to be compared to this abstract
 *                    pathname
 *
 * @return  Zero if the argument is equal to this abstract pathname, a
 *          value less than zero if this abstract pathname is
 *          lexicographically less than the argument, or a value greater
 *          than zero if this abstract pathname is lexicographically
 *          greater than the argument
 *
 * @since   1.2
 */
public int compareTo(File pathname) {
    return fs.compare(this, pathname);
}
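
A small sketch of what this implementation means in practice (the class name PathnameEquality is made up for illustration). Because equals is defined as compareTo == 0 and the comparison works on the pathname strings, the two methods always agree. The -58 reported in the comments on the question is simply the difference between the first characters of "./hello.txt" and "hello.txt": '.' is 46 and 'h' is 104.

```java
import java.io.File;

final class PathnameEquality {
    public static void main(String[] args) {
        File f1 = new File("./hello.txt");
        File f2 = new File("hello.txt");

        // getPath() returns the pathname as given, using the platform separator.
        System.out.println("f1 path: " + f1.getPath());
        System.out.println("f2 path: " + f2.getPath());

        // equals is defined as compareTo == 0, so the two always agree.
        boolean consistent = f1.equals(f2) == (f1.compareTo(f2) == 0);
        System.out.println("equals consistent with compareTo: " + consistent);

        // The same pathname string is equal, whether or not the file exists.
        System.out.println("same pathname equal: " + new File("hello.txt").equals(f2));

        // Character arithmetic behind the lexicographic result.
        System.out.println("'.' - 'h' = " + ('.' - 'h'));
    }
}
```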
Eng.Fouad

If you are using Windows, see the class Win32FileSystem.

The comparison method looks like the one below, so it is expected that your file objects are different.

    public int compare(File f1, File f2) {
      return f1.getPath().compareToIgnoreCase(f2.getPath());
    }

Add these lines to your code as well:

        System.out.println(f1.getPath());
        System.out.println(f2.getPath());

and it will print

.\hello.txt
hello.txt

Hence they are not equal, as the comparison is made using the path property of the File object.

fmucar