I'm trying to convert an XLS file into a CSV file in Java using Apache POI 3.9, but I'm running into a problem. When I try to convert the file I need, I get the following error:

java.io.IOException: Invalid header signature; read 0x0010000000080209, expected 0xE11AB1A1E011CFD0
    at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:140)
    at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:104)
    at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:138)
    at ExtractExcelToCSV.convertExcelToCsv(ExtractExcelToCSV.java:26)
    at ExtractExcelToCSV.main(ExtractExcelToCSV.java:60)

I think the code I'm using is correct (it works with other files). I suspect the problem is the XLS file itself, because when I open it in MS Excel I also get a warning about the file type (it says it is an MS Excel 3 Worksheet). Is there any way I can open these files using POI?

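import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

// cellGrid is a field of the enclosing class ExtractExcelToCSV (declaration not shown)
public static void convertExcelToCsv() throws IOException {
    cellGrid = new ArrayList<List<HSSFCell>>();
    // try-with-resources closes the stream even when POI throws
    try (FileInputStream myInput = new FileInputStream("D:\\...\\filename.xls")) {
        // The IOException in the stack trace is thrown here, while reading the file header
        POIFSFileSystem myFileSystem = new POIFSFileSystem(myInput);
        HSSFWorkbook myWorkBook = new HSSFWorkbook(myFileSystem);
        HSSFSheet mySheet = myWorkBook.getSheetAt(0);
        Iterator<?> rowIter = mySheet.rowIterator();

        // Collect every cell of every row into cellGrid
        while (rowIter.hasNext()) {
            HSSFRow myRow = (HSSFRow) rowIter.next();
            Iterator<?> cellIter = myRow.cellIterator();
            List<HSSFCell> cellRowList = new ArrayList<HSSFCell>();
            while (cellIter.hasNext()) {
                HSSFCell myCell = (HSSFCell) cellIter.next();
                cellRowList.add(myCell);
            }
            cellGrid.add(cellRowList);
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }
}
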
fabiocatalao
  • Pretty sure your file is neither a .xls nor a .xlsx file, hence the error. If you load it in Excel and do a save-as, what format does it display it as currently being? – Gagravarr Nov 18 '13 at 18:38
  • Hi Gagravarr. When I try to save it, it says it is not possible to save in the same format: https://www.dropbox.com/s/v9jyfceo3m2b0hk/Screenshot%202013-11-18%2019.00.12.png Here is what it says about the Excel 3 worksheet format: https://www.dropbox.com/s/k5gd3p9ap8n72f3/Screenshot%202013-11-18%2019.02.11.png – fabiocatalao Nov 18 '13 at 19:03
  • Looks like it's an absolutely ancient version of Excel. POI only supports Excel 97 or newer (i.e. anything from the last 15 years!), your file is practically a fossil and not supported, sorry... – Gagravarr Nov 18 '13 at 21:55
  • Do you know if there is any other way to handle this file? I have looked, but I haven't found anything so far. I really have to use this file, as it is the format a SAS platform gives me as its data export. – fabiocatalao Nov 19 '13 at 10:17
  • Try it in OpenOffice? – Gagravarr Nov 19 '13 at 11:14
  • As an alternative, please see http://stackoverflow.com/questions/17345696/convert-xlsx-to-csv-with-apache-poi-api. Maybe this will be helpful for you. – Sankumarsingh Nov 19 '13 at 11:57
  • I solved my problem using OpenOffice API. Here is the solution I have used: http://openofficejava.blogspot.pt/2009/05/openofficeorg-api.html Thanks! :D – fabiocatalao Nov 19 '13 at 12:54
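
The exception in the question is raised because POI reads the first eight bytes of the file as a little-endian long and compares them with the OLE2 container signature (the "expected" value in the message). A minimal sketch of doing the same check by hand, before handing the file to POI, might look like the following. The class and method names are hypothetical, and the BOF record ids used to guess at pre-97 formats (0x0009, 0x0209, 0x0409) are my understanding of the old BIFF layouts rather than anything POI exposes.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

// Hypothetical helper, not part of Apache POI.
final class SpreadsheetSignature {

    // OLE2 magic bytes D0 CF 11 E0 A1 B1 1A E1, read as a little-endian long.
    // This is the "expected" value printed in the stack trace.
    static final long OLE2_SIGNATURE = 0xE11AB1A1E011CFD0L;

    // Interprets the first eight bytes as a little-endian long.
    static long readLittleEndianLong(byte[] header) {
        if (header.length < 8) {
            throw new IllegalArgumentException("need at least 8 header bytes");
        }
        long value = 0;
        for (int i = 7; i >= 0; i--) {
            value = (value << 8) | (header[i] & 0xFFL);
        }
        return value;
    }

    // Returns a rough, best-effort description of the format.
    static String describe(byte[] header) {
        if (readLittleEndianLong(header) == OLE2_SIGNATURE) {
            return "OLE2 container (Excel 97 or newer .xls)";
        }
        // Assumption: older BIFF files start directly with a BOF record,
        // whose 16-bit little-endian record id hints at the version.
        int recordId = (header[0] & 0xFF) | ((header[1] & 0xFF) << 8);
        switch (recordId) {
            case 0x0009: return "BIFF2 (Excel 2)";
            case 0x0209: return "BIFF3 (Excel 3)";
            case 0x0409: return "BIFF4 (Excel 4)";
            default:     return "unknown (possibly plain text)";
        }
    }

    // Reads the first eight bytes of the file at the given path.
    static byte[] readHeader(String path) throws IOException {
        byte[] header = new byte[8];
        try (DataInputStream in = new DataInputStream(new FileInputStream(path))) {
            in.readFully(header); // throws EOFException for files shorter than 8 bytes
        }
        return header;
    }
}
```

Calling `SpreadsheetSignature.describe(SpreadsheetSignature.readHeader(path))` on the file from the question should, under these assumptions, report BIFF3: the value 0x0010000000080209 from the error message begins with the bytes 09 02, which matches the "Excel 3 Worksheet" warning shown by Excel.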

1 Answer

I had a similar issue. Although the file had the .xls extension, it was NOT an Excel file! As the comment above suggests, doing a "Save As" in Excel may tell you what the format really is. In my case it was a tab-delimited file, so I parsed it without using Apache POI. Hope this helps.
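
For the tab-delimited case, a minimal sketch of the conversion using only the standard library could look like this. It is not the code I actually used; the class and method names are made up for illustration, and it assumes the file is UTF-8 text with one record per line and no tabs embedded inside fields.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for a ".xls" file that is really tab-delimited text.
final class TabToCsv {

    // Quotes a field only when it contains a comma, a quote, or a line break.
    static String quote(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Converts one tab-delimited line to one CSV line.
    static String convertLine(String line) {
        // The -1 limit keeps trailing empty fields instead of dropping them.
        String[] fields = line.split("\t", -1);
        StringBuilder csv = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                csv.append(',');
            }
            csv.append(quote(fields[i]));
        }
        return csv.toString();
    }

    // Streams the input file line by line into the output file.
    // Assumption: both files are UTF-8; adjust the charset for other exports.
    static void convertFile(String inputPath, String outputPath) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(Paths.get(inputPath), StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(Paths.get(outputPath), StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(convertLine(line));
                out.newLine();
            }
        }
    }
}
```

`TabToCsv.convertFile(inputPath, outputPath)` would then write the CSV next to the original. If the export uses a different encoding or quotes its own fields, the sketch would need adjusting.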

dangig