I want to convert a docx to PDF with Apache POI. The docx is generated correctly with docx4j. The conversion works fine with a simple document, but when I try to convert a more heavily styled document, POI throws an exception:

org.apache.xmlbeans.impl.values.XmlValueOutOfRangeException: union value '0000FF">http://schemas.openxmlformats.org/wordprocessingml/2006/main'
15:09:20 org.apache.poi.xwpf.converter.core.XWPFConverterException: org.apache.xmlbeans.impl.values.XmlValueOutOfRangeException: union value '0000FF">http://schemas.openxmlformats.org/wordprocessingml/2006/main'
    at org.apache.poi.xwpf.converter.pdf.PdfConverter.doConvert(PdfConverter.java:70) ~[org.apache.poi.xwpf.converter.pdf-1.0.6.jar:1.0.6]

Here is the cause of this exception:

<w:r>
    <w:rPr>
        <w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial"/>
        <w:color w:val="0000FF"><span style="background-color: rgb(51, 153, 102);"><span style="background-color: rgb(255, 0, 0);"><font color="99CC00"/>
        <w:sz w:val="20"/>
        <w:szCs w:val="20"/>
        <w:highlight w:val="red"/>
    </w:rPr>
    <w:t xml:space="preserve">Juillet-Aout</w:t>
</w:r>

[Screenshot of my document]

And this is my code:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;


import org.apache.poi.xwpf.usermodel.XWPFDocument;

import fr.opensagres.poi.xwpf.converter.pdf.PdfConverter;
import fr.opensagres.poi.xwpf.converter.pdf.PdfOptions;

public class ConvertDocxPdf
{

    public static void main( String[] args )
    {
        long startTime = System.currentTimeMillis();

        // Output file; getParentFile() is null for a bare file name like "result.pdf",
        // so only create parent directories when there is one
        File outFile = new File( "result.pdf" );
        File parent = outFile.getParentFile();
        if ( parent != null )
        {
            parent.mkdirs();
        }

        // try-with-resources closes both streams even if the conversion fails
        try ( InputStream source = new FileInputStream( "test.docx" );
              OutputStream out = new FileOutputStream( outFile ) )
        {
            // 1) Load docx with POI XWPFDocument
            XWPFDocument document = new XWPFDocument( source );

            // 2) Convert POI XWPFDocument 2 PDF with iText
            PdfOptions options = null;// PDFViaITextOptions.create().fontEncoding( "windows-1250" );
            PdfConverter.getInstance().convert( document, out, options );
        }
        catch ( Throwable e )
        {
            e.printStackTrace();
        }

        System.out.println( "Generated result.pdf in " + ( System.currentTimeMillis() - startTime ) + " ms." );
    }
}

And this is the XML line which causes the problem:

<w:r>
    <w:rPr>
        <w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial"/>
        <w:color w:val="0000FF"><span style="background-color: rgb(51, 153, 102);"><span style="background-color: rgb(255, 0, 0);"><font color="99CC00"/> //<-- That line
        <w:sz w:val="20"/>
        <w:szCs w:val="20"/>
        <w:highlight w:val="red"/>
    </w:rPr>
    <w:t xml:space="preserve">Juillet-Aout </w:t>
</w:r>
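
One way to confirm that stray HTML is what breaks the conversion is to scan the document XML for opening tags that carry no namespace prefix, since every WordprocessingML element in the fragment above uses the `w:` prefix. The following is only a minimal sketch under that assumption: the class name `StrayMarkupCheck` and the method `findUnprefixedTags` are hypothetical, and it uses the Java standard library only, not POI or docx4j.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StrayMarkupCheck
{
    // Matches an opening tag whose name has no namespace prefix,
    // e.g. <span ...> or <font .../>, but not <w:color ...> or </w:rPr>.
    private static final Pattern UNPREFIXED_TAG =
        Pattern.compile( "<([A-Za-z][A-Za-z0-9]*)(?=[\\s/>])" );

    // Returns the names of all unprefixed opening tags, in document order.
    public static List<String> findUnprefixedTags( String xml )
    {
        List<String> names = new ArrayList<>();
        Matcher matcher = UNPREFIXED_TAG.matcher( xml );
        while ( matcher.find() )
        {
            names.add( matcher.group( 1 ) );
        }
        return names;
    }

    public static void main( String[] args )
    {
        // Fragment modelled on the problematic run properties shown above.
        String fragment =
            "<w:rPr>"
            + "<w:color w:val=\"0000FF\">"
            + "<span style=\"background-color: rgb(255, 0, 0);\">"
            + "<font color=\"99CC00\"/>"
            + "<w:sz w:val=\"20\"/>"
            + "</w:rPr>";

        // Any names listed here are candidates for markup that does not belong in the docx.
        System.out.println( "Unprefixed tags: " + findUnprefixedTags( fragment ) );
    }
}
```

In a .docx package the main body is stored as word/document.xml, so the same check could be run on the text of that entry after unzipping the file.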
Clement Cuvillier

1 Answer

I had trouble finding updated pre-built jars on the XDocReport websites and repositories. I ended up searching Google for the specific version number I was looking for, and found it at https://mvnrepository.com/artifact/fr.opensagres.xdocreport/fr.opensagres.poi.xwpf.converter.pdf/2.0.1

I'm not sure if this really answers the question, but it does answer the related question of how to get an updated version of the library. Building from source is probably safer, though.

speckledcarp