3

I have 2 excel files and i wanted to compare the contents and highlight the differences. For example:

first file...

name|age
abc|123
def|456
second file...
name|age
abc|123
def|456
ghi|789 - this being the differece

is there any third party libraries to do this? or what would be the best way to do it?

Aleks G
  • 56,435
  • 29
  • 168
  • 265
user1646537
  • 87
  • 1
  • 2
  • 10
  • 3
    If you are looking for reading/writing excel files, the Apache Poi (http://poi.apache.org/) library is the (at least my) first choice. – Dominik Sandjaja Oct 08 '12 at 10:09
  • 1
    http://stackoverflow.com/questions/866346/easiest-way-to-compare-two-excel-files-in-java – titogeo Oct 08 '12 at 10:24
  • the link posted by gTito might be very useful for you if you need to know whether 2 files are exactly the same. Mind that even if the content is the same, different font sizes mean that the files are completely different. – user1581900 Oct 08 '12 at 10:38

3 Answers3

7

Like DaDaDom said Apache POI is what you are looking for. You can download it from this page. Mind that POI project is not fully independent and you may need to download some extra libraries. Follow the instructions on Apache POI website. This is how you use it:

InputStream myxls = new FileInputStream("workbook.xls"));
HSSFWorkbook wb = new HSSFWorkbook(myxls); // for *.xlsx use XSSFWorkbook

If it's a new file you might need to create sheet before proceeding, but in this case the files are already created.

HSSFSheet sheet = wb.getSheetAt(0);       // first sheet
HSSFRow row     = sheet.getRow(0);        // first row
HSSFCell cell   = row.getCell((short)0);  // first cell

To get value from the cell use:

String value = cell.getStringCellValue();

However if the type stored in cell is numeric you would get an error. In case of numbers use:

Int value = cell.getCellValue();

This is a method I wrote to deal with different cell data types:

public String getValue(int x, int y){
    Row row = this.activeSheet.getRow(y);
    if(row==null) return "";
    Cell cell = row.getCell(x);
    if(cell==null) return "";
    int type = cell.getCellType();
    switch(type){
    case 0:
        return cell.getNumericCellValue() + "";
    case 1:
        return cell.getStringCellValue();
    case 2:
        return cell.getCellFormula();
    case 3:
        return "";
    case 4:
        return cell.getBooleanCellValue() + "";
    case 5:
        return cell.getErrorCellValue() + "";
    default:
        return "";
    }
}

I hope this quick introduction into Apache POI will help you with your project :)

user1581900
  • 3,680
  • 4
  • 18
  • 21
4

From this question, my answer partially duplicated below.

My project simple-excel which provides a bunch of Hamcrest Matchers and wraps up Apache POI's syntax.

When you do something like the following,

assertThat(actual, WorkbookMatcher.sameWorkbook(expected));

You'd see, for example,

java.lang.AssertionError:
Expected: entire workbook to be equal
     but: cell at "C14" contained <"bananas"> expected <nothing>,
          cell at "C15" contained <"1,850,000 EUR"> expected <"1,850,000.00 EUR">,
          cell at "D16" contained <nothing> expected <"Tue Sep 04 06:30:00">
    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)

Read a blog post about it

Community
  • 1
  • 1
Toby
  • 9,523
  • 8
  • 36
  • 59
0

I would use epplus to load both documents into datatables and then iterate over them to find differences. Depending on how you want to highlight differences, you could simply color the cells with epplus and save them back to the files..

Christian Sauer
  • 10,351
  • 10
  • 53
  • 85