0

Use case: I have a cost calculator implemented in excel. I have to insert a huge number of inputs to be entered into the input cells in a sheet and get the outputs from the same sheet. I have 3 excel sheets like that which are ~3MB in size. The excel sheets are in .xlsm formats.

Technologies used: Java 1.8 and Apache POI 4.0.1

Problem: Writing input values to the excel files that are >3MB is fast (takes ~10 sec for around 12 inputs). But the same action on excel file that is <3MB too slow (~10 sec for only one input).

I have seem that some people here say to use SXSSFWorkbook. I replaced the XSSFWorkbook with that, but it is giving me Null Pointer Exception. Also, this is used for large excel sheets that are above 100MB, right?

Code for writing values:

    public void setData1(String filePath, String sheetName, int rowNum, int colNum, Object data) throws IOException {
        File file = new File(filePath); 
        FileInputStream fis = null;
        try {
            fis = new FileInputStream(file);
        } catch (FileNotFoundException e) {
            Messages.terminate(file.getName() + " file is missing.");
        }
        Workbook workbook = new XSSFWorkbook(fis);
        Sheet sheet = workbook.getSheet(sheetName);
        Row row = sheet.getRow(rowNum);
        Cell cell = row.getCell(colNum);
        if(data instanceof Integer) {
            cell.setCellValue((int)data);
        }
        else if (data instanceof String) {
            cell.setCellValue(String.valueOf((String) data));
        }
        else if(data instanceof Double) {
            if((double)data == Math.floor((double)data)) {
                cell.setCellValue((int)Math.floor((double)data));
            }
            else {
                cell.setCellValue((double)data);
            }
        }
        FileOutputStream fos = new FileOutputStream(new File(filePath));
        BaseFormulaEvaluator.evaluateAllFormulaCells(workbook);
        workbook.write(fos);
        workbook.close();
        fos.close();
        fis.close();
    }
  • 1
    Your question is not clear. It is about "delay in writing values into a excel file" but your code is only reading one value from a special file, sheet and cell. If that takes 10 seconds for one call, then your environment is very poor (very less memory available). But one hint: If this method is made to read multiple cell values out of the same file, then it should not new creating the workbook from that file every time. Instead the workbook should be a class member which gets created only once. – Axel Richter Apr 28 '19 at 05:13
  • Hi. Thanks for the reply. I have changed the code that was present in the question. Please review it. Also, I get the point that File should be created once. But I uses 2 different excel sheets. On acts as a Input File. The other one is where the calculations are done, basically a calculator. The same code works fine for 2 calculator files that's slightly bigger than the one I'm facing the issue with. If the issues was with my system config, the program should be slow for all cases, right? And thanks for suggestions!!! – Santhosh J Apr 28 '19 at 15:52
  • 1
    Both the workbooks should be class members then and only created once and not every time a cell value needs to be read or write. Also the evaluating all formula cells and then writing the workbook should only be done once after changing all cell values and not for each single changed cell value. – Axel Richter Apr 28 '19 at 16:00
  • The code you have provided is what you would call to copy a single cell from Workbook A to Workbook B. You need to open both workbooks from wherever you are calling this method, not in the method itself. – Cubius Apr 28 '19 at 22:53
  • @AxelRichter I'll change them as class members. Evaluating formulas works only if I evaluate just after writing them, else they are not evaluated. – Santhosh J Apr 29 '19 at 06:43
  • @Cubius What is drawbacks of this implementation? Also, it'd fine if you can explain with some sample code. – Santhosh J Apr 29 '19 at 06:45
  • "Evaluating formulas works only if I evaluate just after writing them, else they are not evaluated.": This is not true. The method `evaluateAllFormulaCells` evaluates **all** formula cells in the **whole** workbook using current values of the referenced cells. If that are hundreds of formula cells then this will take time. So it never should be done after each changing a cell value, else each changig a cell value will take that big amount of time. – Axel Richter Apr 29 '19 at 09:42
  • Maybe you only wants evaluating **one** special formula cell? Or only **some** special formula cells? If so do using [FormulaEvaluator.evaluate(Cell cell)](https://poi.apache.org/apidocs/dev/org/apache/poi/ss/usermodel/FormulaEvaluator.html#evaluate-org.apache.poi.ss.usermodel.Cell-). – Axel Richter Apr 29 '19 at 09:42
  • "Maybe you only wants evaluating one special formula cell? Or only some special formula cells? If so do using FormulaEvaluator.evaluate(Cell cell).": No, actually there are a number of cells that gets evaluated. These cells are dependent upon other formula cells. And Majority of the cells are formulas, hence the best option is to evaluate all. – Santhosh J Apr 29 '19 at 09:55

1 Answers1

0

The following code opens the workbook only once instead of opening and closing it repeatedly for every value that is entered:

public void setData1(Workbook workbook, FileOutputStream fos, String filePath, String sheetName, int rowNum, int colNum, Object data) throws IOException {
    Sheet sheet = workbook.getSheet(sheetName);
    Row row = sheet.getRow(rowNum);
    Cell cell = row.getCell(colNum);
    if(data instanceof Integer) {
        cell.setCellValue((int)data);
    }
    else if (data instanceof String) {
        cell.setCellValue(String.valueOf((String) data));
    }
    else if(data instanceof Double) {
        if((double)data == Math.floor((double)data)) {
            cell.setCellValue((int)Math.floor((double)data));
        }
        else {
            cell.setCellValue((double)data);
        }
    }
    BaseFormulaEvaluator.evaluateAllFormulaCells(workbook); //I don't have enough context here, but if it does not cause any logical problems, move this line to *
    workbook.write(fos);
}

public void callsSetData1() {
    object Data = new Object()
    File file = new File(filePath); 
    FileInputStream fis = null;
    try {
        fis = new FileInputStream(file);
    } catch (FileNotFoundException e) {
        Messages.terminate(file.getName() + " file is missing.");
    }
    Workbook workbook = new XSSFWorkbook(fis);
    FileOutputStream fos = new FileOutputStream(new File(filePath));
    //*
    for(i = 0; i < 10; i++) {
        setData1(workbook, fos, "R:\andom\File\Path", "randomSheetName",0,0, data); //I'm assuming you are calling setData1() multiple times, as I do not have the code of the method that calls it, I've just used a for-loop for now
    }
    workbook.close();
    fos.close();
    fis.close();
}

I recommend you also provide the code of the method where you call setData1. As I do not have that, callsSetData1 might contain a few logical errors.

The point is that you should be opening and closing the workbook only once in the method that calls setData1 and the pass it over as a parameter. This should help with delay.

The only minor drawback is that you will have a few more lines of code if you call setData1 in multiple different methods.

Cubius
  • 91
  • 5