I am using Apache POI 3.13 and am trying to search and replace text in a given template file, then save the result as a newly generated .docx. Here's my code:

public static void main(String[] args) throws InvalidFormatException, IOException {
    String filePath = "Sample.docx";
    File outputfile = new File("SampleProcessed.docx");

    XWPFDocument doc = new XWPFDocument(OPCPackage.open(filePath));

    for (XWPFParagraph p : doc.getParagraphs()) {
        List<XWPFRun> runs = p.getRuns();
        if (runs != null) {
            for (XWPFRun r : runs) {
                String text = r.getText(0);
                if (text != null && text.contains("$VAR")) {
                    text = text.replace("$VAR", "JohnDoe");
                    r.setText(text, 0);
                }
            }
        }
    }

    doc.write(new FileOutputStream(outputfile));
    doc.close();
    System.out.println("Done");
    Desktop.getDesktop().open(outputfile);
}

This looks pretty straightforward, but when I run this code, the original document "Sample.docx" also gets modified. In the end I have two documents with identical contents.

Is this the normal behavior of POI? I thought opening the document only loads it into memory, then doing the 'doc.write(OutputStream);' would flush it to disk.

I tried writing to the same 'filePath' but as expected it throws an exception since I'm trying to write to a currently open file.

The only thing that worked was copying the template file first and using that copy instead. But now I have 3 files: the original template 'Sample.docx', and two others with the same content (SampleProcessed.docx and SampleProcessedOut.docx).

It worked, but it's pretty wasteful. Is there any way to avoid this? Am I doing something wrong? Perhaps I am opening the Word document incorrectly?

morido
yev

1 Answer

Since you are using

XWPFDocument doc = new XWPFDocument(OPCPackage.open(filePath));

to create the XWPFDocument, an OPCPackage is opened from filePath in READ_WRITE mode. When that package is closed, its contents are also saved back to the file. See https://poi.apache.org/apidocs/org/apache/poi/openxml4j/opc/OPCPackage.html#close%28%29.

The OPCPackage is closed when the XWPFDocument is closed, so calling doc.close() writes your changes back to "Sample.docx".

But why do it that way? Why not

XWPFDocument doc = new XWPFDocument(new FileInputStream(filePath));

?

With this, the XWPFDocument is created in memory only, backed by a new OPCPackage that has no connection to a file.
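For illustration, here is a minimal sketch of the question's search-and-replace rewritten to load the template through an InputStream. The class name `TemplateFill` and the helper `fill` are hypothetical names introduced for this example. It also assumes that XWPFDocument can be used in try-with-resources (it has a close() method, as the question's code shows) and that Apache POI is on the classpath.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

public class TemplateFill {

    // Hypothetical helper: replaces a placeholder in the body paragraphs of
    // the template and writes the result to a separate output file.
    static void fill(Path template, Path output, String placeholder, String value)
            throws IOException {
        // Loading from an InputStream gives an in-memory package that is not
        // tied to the template file, so closing the document does not save
        // anything back to it.
        try (InputStream in = Files.newInputStream(template);
             XWPFDocument doc = new XWPFDocument(in)) {

            for (XWPFParagraph p : doc.getParagraphs()) {
                for (XWPFRun r : p.getRuns()) {
                    String text = r.getText(0);
                    if (text != null && text.contains(placeholder)) {
                        r.setText(text.replace(placeholder, value), 0);
                    }
                }
            }

            // The only file written is the explicit output file.
            try (OutputStream out = Files.newOutputStream(output)) {
                doc.write(out);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        fill(Paths.get("Sample.docx"), Paths.get("SampleProcessed.docx"),
                "$VAR", "JohnDoe");
    }
}
```

Like the original code, this only matches a placeholder that sits entirely inside a single run; Word may split "$VAR" across several runs, in which case it will not be found.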

Axel Richter
  • This worked! Thanks! I missed that. Sorry, I should have read the documentation beforehand. I thought the OPCPackage was the only way to open a .docx file. Thanks again! – yev Feb 14 '16 at 08:32
  • There is also a version of open() that takes a PackageAccess parameter, where you can specify READ as the open mode and thus also avoid writing the data back, see https://poi.apache.org/apidocs/org/apache/poi/openxml4j/opc/OPCPackage.html#open(java.io.File,%20org.apache.poi.openxml4j.opc.PackageAccess) – centic Feb 15 '16 at 06:42
  • @centic: Have you tried this? How will you change text within `Run`s if the `OPCPackage` is opened read only? – Axel Richter Feb 15 '16 at 16:08