A *.docx file is simply a ZIP archive containing multiple XML files plus some other files. So after XWPFDocument.write, the result, whether a file or a byte array, can be handled as such: unzip it and look at /word/document.xml, for example.
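As a minimal sketch of that unzip approach, the following reads one part out of a ZIP archive held in memory, using only java.util.zip. The class name DocxPartReader and the method readPart are hypothetical names chosen for illustration. The sketch assumes the bytes come from something like docx.write(new ByteArrayOutputStream()), that the part is UTF-8 encoded, and that the entry is named word/document.xml without a leading slash. The main method uses a hand-built stand-in archive rather than a real document.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of Apache POI.
class DocxPartReader {

    // Returns the content of the named ZIP entry as UTF-8 text,
    // or null if the archive has no entry with that name.
    static String readPart(byte[] zipBytes, String partName) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals(partName)) {
                    // readAllBytes() stops at the end of the current entry
                    return new String(zip.readAllBytes(), StandardCharsets.UTF_8);
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in archive with a single entry; with POI on the class path
        // the bytes would instead come from XWPFDocument.write (assumption).
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            zip.putNextEntry(new ZipEntry("word/document.xml"));
            zip.write("<w:document/>".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        System.out.println(readPart(buffer.toByteArray(), "word/document.xml"));
    }
}
```

This still requires serializing the whole document first, which is the cost the rest of this answer tries to avoid.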
But if one wants to avoid writing out the whole document, one needs to know that XWPFDocument is internally based on org.openxmlformats.schemas.wordprocessingml.x2006.main.CT* objects, which all extend org.apache.xmlbeans.XmlObject. And XmlObject.toString() returns the XML as a String. For the document XML, XWPFDocument.getDocument returns an org.openxmlformats.schemas.wordprocessingml.x2006.main.CTDocument1, which is the representation of /word/document.xml.
So System.out.println(docx.getDocument().toString()); will print the XML of the underlying CTDocument1.
Unfortunately, an org.apache.xmlbeans.XmlObject only represents the contents of an element or attribute, not the element or attribute itself. So when you validate or save an XmlObject, you are validating or saving its contents, not its container. For CTDocument1 that means the printed XML contains the body elements but not the document container itself. To get the document container itself as an XmlObject, one needs an org.openxmlformats.schemas.wordprocessingml.x2006.main.DocumentDocument object which contains the CTDocument1.
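To make the contents-versus-container difference visible, one can check which root element a printed XML string has. The sketch below uses only the JDK's DOM parser. The class name RootElementName and its method are hypothetical, and the two sample strings in main are hand-written stand-ins, not actual POI output. They assume that XMLBeans wraps bare contents in an xml-fragment element, while the DocumentDocument form has w:document as its root.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Apache POI or XMLBeans.
class RootElementName {

    // Parses the given XML text and returns the qualified name of its root element.
    static String rootElementName(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document dom = builder.parse(new InputSource(new StringReader(xml)));
        return dom.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        String ns = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
        // Hand-written samples (assumption): contents only vs. whole document container.
        String contentsOnly = "<xml-fragment xmlns:w=\"" + ns + "\"><w:body/></xml-fragment>";
        String wholeDocument = "<w:document xmlns:w=\"" + ns + "\"><w:body/></w:document>";
        System.out.println(rootElementName(contentsOnly));
        System.out.println(rootElementName(wholeDocument));
    }
}
```

With POI on the class path, the same check could be applied to the strings returned by docx.getDocument().toString() and by the DocumentDocument's toString().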
Example of printing the document XML from an XWPFDocument:
import java.io.FileOutputStream;
import org.apache.poi.xwpf.usermodel.*;

public class CreateXWPFDocumentDumpDocumentXML {

    static void printDocumentXML(XWPFDocument docx) throws Exception {
        String xml;

        // CTDocument1 alone: prints only the contents, without the document container
        System.out.println("Contents of org.openxmlformats.schemas.wordprocessingml.x2006.main.CTDocument1:");
        org.apache.xmlbeans.XmlObject documentXmlObject = docx.getDocument();
        xml = documentXmlObject.toString();
        System.out.println(xml);

        // DocumentDocument wrapping the CTDocument1: prints the document container too
        System.out.println("Contents of whole DocumentDocument:");
        org.openxmlformats.schemas.wordprocessingml.x2006.main.CTDocument1 ctDocument1 = docx.getDocument();
        org.openxmlformats.schemas.wordprocessingml.x2006.main.DocumentDocument documentDocument =
            org.openxmlformats.schemas.wordprocessingml.x2006.main.DocumentDocument.Factory.newInstance();
        documentDocument.setDocument(ctDocument1);
        xml = documentDocument.toString();
        System.out.println(xml);
    }

    public static void main(String[] args) throws Exception {
        XWPFDocument docx = new XWPFDocument();

        // first paragraph with one bold run
        XWPFParagraph paragraph = docx.createParagraph();
        XWPFRun run = paragraph.createRun();
        run.setBold(true);
        run.setFontSize(22);
        run.setText("The paragraph content ...");

        // second, empty paragraph
        paragraph = docx.createParagraph();

        // dump the XML before anything is written out
        printDocumentXML(docx);

        try (FileOutputStream out = new FileOutputStream("./XWPFDocument.docx")) {
            docx.write(out);
        }
    }
}