I am using Apache POI to create a .docx file with the following code:
XWPFDocument document = new XWPFDocument();
XWPFParagraph paragraph = document.createParagraph();
XWPFRun run = paragraph.createRun();
run.setText(text);
String filePath = outputPathWithoutExtension + ".docx";
// try-with-resources closes the stream even if write() fails
try (FileOutputStream stream = new FileOutputStream(new File(filePath))) {
    document.write(stream);
} catch (IOException exception) {
    LOGGER.error("Could not create file '{}'", filePath, exception);
}
I then try to read the file back with the following code:
// try-with-resources closes the input stream once extraction finishes or fails
try (FileInputStream fileStream = new FileInputStream(filePath)) {
    XWPFDocument docx = new XWPFDocument(fileStream);
    XWPFWordExtractor wordExtractor = new XWPFWordExtractor(docx);
    text = wordExtractor.getText();
} catch (IOException | POIXMLException | OfficeXmlFileException
        | NullPointerException exception) {
    LOGGER.error("Could not load file - Exception: {}", exception.getMessage());
}
On the line where I call getText(), it throws a NullPointerException:
java.lang.NullPointerException
at org.apache.poi.xwpf.extractor.XWPFWordExtractor.extractHeaders(XWPFWordExtractor.java:162)
at org.apache.poi.xwpf.extractor.XWPFWordExtractor.getText(XWPFWordExtractor.java:87)
The issue appears to be that getText() calls extractHeaders() with the XWPFHeaderFooterPolicy of the document, which in my case is null. extractHeaders() uses it on its very first line, and that is where the exception comes from.
I tried to create my own "header/footer policy" like so:
try {
    new XWPFHeaderFooterPolicy(document);
} catch (IOException | XmlException exception) {
    LOGGER.warn("Could not create output document header - "
            + "document might not be readable in all readers");
}
However, that itself throws a NullPointerException, because the constructor tries to access the "SectPr" of the document via doc.getDocument().getBody().getSectPr(), which returns null, and the first time it uses that value it fails.
So, my question is: I'm clearly not creating the XWPFDocument correctly. Could someone set me straight?
Side note: If I open the file in Word, it looks fine. If, between creating and reading the file, I open it in Word, edit it, save it, and close it, then the call to getText() executes as expected with no NullPointerException. My guess is that Word fills in the missing section properties (and with them the header/footer policy) on save.
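To check that guess about what Word adds on save, here is a minimal sketch that uses only the JDK (no POI) to look inside the .docx, which is a ZIP archive, and report whether word/document.xml contains a section properties element. The class name DocxSectPrCheck and the method hasSectPr are hypothetical names of my own. The check is a plain string match that assumes the usual "w:" namespace prefix, so it is a rough diagnostic rather than a proper XML parse.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper: not part of Apache POI.
class DocxSectPrCheck {

    // Returns true if word/document.xml inside the given .docx stream
    // contains a "<w:sectPr" element (assumes the conventional "w:" prefix).
    static boolean hasSectPr(InputStream docx) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"word/document.xml".equals(entry.getName())) {
                    continue;
                }
                // Read the current entry fully; read() returns -1 at the end of the entry.
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    buffer.write(chunk, 0, n);
                }
                String xml = new String(buffer.toByteArray(), StandardCharsets.UTF_8);
                return xml.contains("<w:sectPr");
            }
        }
        // No main document part found in the archive.
        return false;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java DocxSectPrCheck <file.docx>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println("sectPr present: " + hasSectPr(in));
        }
    }
}
```

Running this against the file as written by POI and again after re-saving it in Word should show whether the section properties element is the difference between the two.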