I need to create a method that can read both xls and xlsx files. According to my research, HSSF is used to read xls and XSSF to read xlsx. Is there a part of the Apache POI I can use to read both files? I also came across the ss.usermodel but found no sufficient codes that will entertain both xls and xlsx....
-
WorkbookFactory.create() must be the accepted answer. http://poi.apache.org/apidocs/org/apache/poi/ss/usermodel/WorkbookFactory.html – Kishan B May 15 '16 at 19:57
7 Answers
Yes, there's a new set of interfaces provided by POI that work with both types.
Use the WorkbookFactory.create() method to get a Workbook: http://poi.apache.org/apidocs/org/apache/poi/ss/usermodel/WorkbookFactory.html
You can check for excel files without relying on file extensions (which are unreliable - many csv files have xls extensions for example but cannot be parsed by POI) using the following:
//simple way to check for both types of excel files
public boolean isExcel(InputStream i) throws IOException{
return (POIFSFileSystem.hasPOIFSHeader(i) || POIXMLDocument.hasOOXMLHeader(i));
}

- 2,704
- 16
- 28
-
Hello! Thanks for that code you posted. I find it better than the lengthy codes I have tried. – TOMAS DEL CASTILLO Oct 03 '13 at 03:10
-
-
I haven't had much exp with Apache POI, but as far as i know if you refer to a workbook by class "Workbook" then you can read and write both xls & xlsx.
All you have to do is when creating object write
for .xls-
Workbook wb = new HSSFWorkbook();
for .xlsx-
Workbook wb = new XSSFWorkbook();
you can pass a parameter for file type and create the WorkBook object accordingly using If statement.

- 437
- 4
- 12
-
Hello! I also have the said codes in mind and I was planning to use those if I cannot find any other way to detect the file extension. – TOMAS DEL CASTILLO Oct 03 '13 at 02:54
-
I recommend you to check this basic utility to read all excel files: https://github.com/eaorak/excelr – koders Mar 04 '15 at 18:59
you can read using poi-ooxml and poi-ooxml-schema jars provided by apache.
and use below code:--
Workbook wb = null;
excelFileToRead = new FileInputStream(fileName);
wb = WorkbookFactory.create(excelFileToRead);
Sheet sheet = wb.getSheet(sheetName);
the above code will read both xls and xlsx files

- 589
- 8
- 10
-
This is the best answer. If one day POI supports a third format, this code will still work. – Conffusion Apr 24 '19 at 08:31
Thanks to Tom's answer just to add, use foll. code to get inputstream else we may face Exception in thread "main" java.io.IOException: mark/reset not supported
InputStream inputStream = new FileInputStream(new File("C:\\myFile.xls"));
if(! inputStream.markSupported()) {
inputStream = new PushbackInputStream(fileStream, 8);
}

- 465
- 2
- 6
- 20
You can use
Workbook wb = WorkBookFactory().create(inputStream);

- 61
- 1
- 6
-
Using a `File` is generally better, see http://poi.apache.org/spreadsheet/quick-guide.html#FileInputStream – Gagravarr Jun 16 '17 at 12:17
one option would be to check the file name with lastIndexOf for . and see if it is .xls or xlsx and then use an if condition to switch accordingly. been a long time since i worked on poi but i think it the attributes are like HSSF for .xls and XSSF for .xlsx refer http://poi.apache.org/ site, last line under the topic Why should I use Apache POI?

- 1,007
- 3
- 12
- 29
It appears you are looking for a way to abstract the read process, you are saying it doesn't matter if its XLS or XLSX, you want your code to work without modification.
I'd recommend you to look at Apache Tika, its an awesome library that abstracts file reading and content parsing, it uses POI and many other libraries and has a nice abstraction to all of them.
reading a PDF/XLS/XLSX is similar to reading a text file, all the work is done behind the scene.
read this for more. http://www.searchworkings.org/blog/-/blogs/introduction-to-apache-tika

- 9,920
- 4
- 33
- 35
-
Thank you very much for suggesting Apache Tika, its the first time I read about that. I'll look into this and compare it with Apache POI – TOMAS DEL CASTILLO Oct 03 '13 at 03:08
-
Tika uses POI and many other libraries, its an abstraction to all of them, not a replacement. – Anantha Sharma Oct 03 '13 at 06:17