0

I am creating a Groovy script to export tables from an .xlsm file to a .csv file including formulas when appropriate (rather than generated data). When the script calls .getCellType() on the current cell I get a null pointer exception, even though this functionality occurs within an if statement that tests whether the cell is null.

I have tried replacing the

if(cell != null) 

condition with

if(cell.getCellType() != CellType.BLANK)

to no avail.

The code and full error message are below.

    #!/usr/bin/env groovy

    @Grab(group = 'org.apache.poi', module = 'poi', version = '4.1.0')
    @Grab(group = 'org.apache.poi', module = 'poi-ooxml', version = '4.1.0')

    import org.apache.poi.xssf.usermodel.XSSFWorkbook
    import org.apache.poi.xssf.usermodel.*
    import org.apache.poi.ss.usermodel.*


    Workbook wb = new XSSFWorkbook("input.xlsm")

    FormulaEvaluator fe = wb.getCreationHelper().createFormulaEvaluator()

    DataFormatter formatter = new DataFormatter()
    PrintStream out = new PrintStream(new FileOutputStream("output.csv"), true, "UTF-8")
    byte[] bom = [(byte)0xEF, (byte)0xBB, (byte)0xBF]
    out.write(bom)

    for (Sheet sheet : wb){
        for (Row row : sheet) {
            boolean firstCell = true
            for(Cell cell : row){
                if (! firstCell) out.print(',')
                if ( cell != null ) {
                    if (fe != null) cell = fe.evaluateInCell()
                    String value = formatter.formatCellValue(cell)
                    if (cell.getCellType() == CellType.FORMULA) {
                        value = "=" + value
                    }
                    out.print(value)
                }

                firstCell = false
            }
            out.println()
        }
    }

Error Message:

Caught: java.lang.NullPointerException: Cannot invoke method 
getCellType() on null object java.lang.NullPointerException: 
Cannot invoke method getCellType() on null object
at ExcelToCSV.run(ExcelToCSV.groovy:28)

My expectation is that in the case that the current cell is not null, if the cell is evaluated to contain a formula, the string output to the .csv file will have an "=" appended to the beginning of it, otherwise it will simply output the string representing the value within the cell.

I am unfortunately having problems with my IDE skipping over breakpoints and currently I am unable to step through the code or view the variable values, which is a separate issue I am also working on. Until I get that resolved, I am hoping someone is able to point out what I might be missing.

Olaf Kock
  • 46,930
  • 8
  • 59
  • 90
Blake Simmons
  • 426
  • 1
  • 8
  • 23
  • You need to check if the cell is null *before* calling any methods on it, eg `getCellType` – Gagravarr Jun 07 '19 at 17:04
  • Yes, this is what I'm doing within the if statement where the method call resides, or am I missing something? As I said its been a challenge trying to debug without being able to set breakpoints. – Blake Simmons Jun 07 '19 at 18:03
  • 2
    Why using [FormulaEvaluator.evaluateInCell](https://poi.apache.org/apidocs/dev/org/apache/poi/ss/usermodel/FormulaEvaluator.html#evaluateInCell-org.apache.poi.ss.usermodel.Cell-)? Most times this is not what one would want. Remove the line `if (fe != null) cell = fe.evaluateInCell()` Then do `String value = formatter.formatCellValue(cell, fe)` instead. And, if you want the formula string with leading `=`, then `if (cell.getCellType() == CellType.FORMULA) value = "=" + cell.getCellFormula()` – Axel Richter Jun 08 '19 at 04:49

1 Answers1

1

You are assigning to cell after the null check, but before you call a cell method. That assignment must be a null value.

evaluateInCell takes an argument of type Cell, so if you replace your line

if (fe != null) cell = fe.evaluateInCell()
// cell == null

with

if (fe != null) cell = fe.evaluateInCell(cell)

then you get what the javadocs say you should get, which is the unchanged cell for simple values or the formula result for formulas.

I think you will also find that the Cell type will change to reflect the type of the value returned by the formula evaluation, so your test for CellType.FORMULA will always be false.

npskirk
  • 1,188
  • 1
  • 8
  • 21