4

We have an excel file that is pre-created with fancy charts and formulae that cross-reference sheets. It also contains macros. Source data is supposed to be copy-pasted into one sheet and the rest of workbook populates itself from pre-existing wiring.

Note: I really don't care about reading / modifying other sheets in this workbook, I just need to save the pains of copy-pasting raw data every time to this data sheet in this workbook.

I'm using Apache POI and trying to create a XSSFWorkbook instance from this "template" excel file. However, it takes a long time (almost a minute consistently across many runs) for this object to be constructed. The excel file itself is just around 400KB, not a big file. I profiled this using jconsole, it seems like it is not limited by either CPU or Heap - It uses just around 90 MB heap memory (I had started it up with 2GB committed heap) and around 52% CPU.

It takes a very short time to actually populate the data sheet with raw data and write the final updated file out (roughly 3-4 seconds). Here is my startup code:

public static void startup() throws FileNotFoundException, IOException {
    long start = System.nanoTime();
    System.out.println("Started...");
    TEMPLATE_WORKBOOK = new XSSFWorkbook(new FileInputStream(new File(TEMPLATE)));
    long end = (System.nanoTime() - start) / NANOS;
    System.out.println("It took " + end + " seconds..");
}

I thought about loading up this TEMPLATE_WORKBOOK once and then reuse the same handle to write new data for every subsequent request - I simulated this with a sleep and a forever-while in my main class. But I can't do this apparently, I got an exception "Exception in thread "main" org.apache.xmlbeans.impl.values.XmlValueDisconnectedException". The TEMPLATE_WORKBOOK object is not re-usable.

I do see that there is an event-based API, but before I get into it, I wanted to see if I'm missing something here! Again, memory / CPU is not an issue here, we have plenty of heap to spare. I'm trying to reduce time.

P.S: I tried the tips here: XSSFWorkbook takes a lot of time to load - they didn't help the startup time.

Community
  • 1
  • 1
lramakri
  • 56
  • 1
  • 4

1 Answers1

1

I think you are in a bad position for what you want to do. I had the same kind of problem (see here) and unfortunately there is no solution in your case.

Even if you want to just right in your data sheet, the only way you can do it is openning your file as an XSSFWorkbook, even in event-based API.

The SXSSFWorkbook can't be constructed from a File, InputStream or an OPCPackage. You can only build it from scratch or from an XSSFWorkbook.

The only really aweful time-consumming possible solution (I don't know if it is possible) is to try to input the data with the reading parser, meaning that you should parse the excel file into xml yourself.

Our final solution is waiting that the solution to our problem will be solved.

Sorry that you couldn't have the answer that you wanted.

Community
  • 1
  • 1
ArtiBucco
  • 2,199
  • 1
  • 20
  • 26