0

This is the code that I have for reading a very large excel file (xlsx) that is 23.5MB with 700,000+ rows.

String dir = rootPath + File.separator + "tmpFiles" + File.separator 
+ FILE_NAME;
File fisNew = new File(dir);
Workbook w = StreamingReader.builder()
.rowCacheSize(100)
.open(fisNew);
Sheet worksheet = null;
worksheet = w.getSheetAt(0);
worksheet.getRow(0).getPhysicalNumberOfCells();

I get an UnsupportedOperationException Null pointer error on this line: worksheet.getRow(0).getPhysicalNumberOfCells(); And I also don't get an actual String value when I print out this line: SpecialtyUtil.removeWhiteSpaces(excelheader.getCell(0)). I am supposed to get the name of the column but I get some StreamingSheet string instead. Not so sure what I need to change here in order to process a xlsx file.

EDIT: Any idea how to write to an excel file using StreamingReader? I know that it is an unsupported operation, but is there a workaround?

posed1940
  • 23
  • 2
  • 6

2 Answers2

2

If you look into the following source code in github link, StreamingSheet does not support the method getPhysicalNumberOfCells(). I provide below the code snippet.

/**
   * Not supported
   */
  @Override
  public int getPhysicalNumberOfRows() {
    throw new UnsupportedOperationException();
  }

github link is given below. https://github.com/monitorjbl/excel-streaming-reader/blob/master/src/main/java/com/monitorjbl/xlsx/impl/StreamingSheet.java#L97

Sambit
  • 7,625
  • 7
  • 34
  • 65
  • Okay, so what would I use to replace this then? That's where I'm stuck at. I need to get the rows but not sure of a workaround? – posed1940 Jun 05 '19 at 15:38
  • Check this useful links. https://howtodoinjava.com/library/readingwriting-excel-files-in-java-poi-tutorial/ . http://poi.apache.org/components/spreadsheet/limitations.html – Sambit Jun 05 '19 at 15:43
  • I started using XSSF however, the file that I was processing was too large and would cause a GCOverhead Limit Exceeded error, even after adding the following options to my Tomcat run configurations: -Xmx1024m -XX:-UseGCOverheadLimit. So I started to use the StreamingReader to only store 100 rows in memory instead of loading everything, but I am stuck at this UnsupportedOperation null exception. – posed1940 Jun 05 '19 at 15:46
  • Can you provide the link of your excel sheet if it is possible, I can try to look into it. – Sambit Jun 05 '19 at 16:00
  • I am not so sure how to attach the file or upload it anywhere? Could I have a link to your github and I can attach the file that way? – posed1940 Jun 05 '19 at 16:07
  • You can create a github account and upload the excel sheet and you can share the link so that people will help you. – Sambit Jun 05 '19 at 16:08
  • with streaming apis, like excel-streaming-reader, you need to read the whole sheet - for physical row count, would it be feasible to just iterate through the rows in the sheet and get the row num of the last row? For cell count in a given row, latest excel-streaming-reader code supports that: https://github.com/monitorjbl/excel-streaming-reader/blob/master/src/main/java/com/monitorjbl/xlsx/impl/StreamingRow.java#L107 – PJ Fanning Jun 05 '19 at 17:59
  • Any idea how to write to an excel file using StreamingReader? I know that it is an unsupported operation, but is there a workaround? – posed1940 Jun 11 '19 at 17:57
  • @Vicky1940, I have got a similar problem from another user also, I am also trying to do something. Due to other works, I am unable to complete it. I am trying to achieve, once it is done, I will definitely let you know. – Sambit Jun 11 '19 at 18:13
  • Okay, I was able to read the file but now I am unable to use the write function because it is an unsupported method but I'm trying to figure out a workaround – posed1940 Jun 11 '19 at 18:18
0

We can use getLastRowNum()

Integer noOfCol = sheet.getLastRowNum(); // row no starts from 0 --- n

here is the implementation

@Override
public int getLastRowNum() {
  return reader.getLastRowNum();
}

StreamingSheet.java

Harit Kumar
  • 217
  • 2
  • 5