Convert HTML code to Word Using docx4j Image is not embedded in Word document

Question

Sample program...

import java.io.File;
import java.nio.charset.StandardCharsets;
import org.docx4j.Docx4jProperties;
import org.docx4j.jaxb.Context;
import org.docx4j.model.structure.PageSizePaper;
import org.docx4j.openpackaging.contenttype.ContentType;
import org.docx4j.openpackaging.exceptions.Docx4JException;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.PartName;
import org.docx4j.openpackaging.parts.WordprocessingML.AlternativeFormatInputPart;
import org.docx4j.relationships.Relationship;
import org.docx4j.wml.CTAltChunk;

public class HtmlToDoc {
    public static void main(String[] args) throws Docx4JException {
        String filepath = "E://HtmlToDoc//";

        // HTML to import; the image is referenced by an absolute local path
        String html = "<html><head><title>Import me</title></head><body><p>Hello World! Sample Program</p><img src=\"E:/HtmlToDoc/LOGO.JPEG\"/></body></html>";

        // Page size and orientation are read from the docx4j properties
        Docx4jProperties.getProperties().setProperty("docx4j.PageSize", "B4JIS");
        String papersize = Docx4jProperties.getProperties().getProperty("docx4j.PageSize", "B4JIS");
        String landscapeString = Docx4jProperties.getProperties().getProperty("docx4j.PageOrientationLandscape", "true");
        boolean landscape = Boolean.parseBoolean(landscapeString);

        WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage(PageSizePaper.valueOf(papersize), landscape);

        // Store the raw HTML in the package as an alternative format (altChunk) part
        AlternativeFormatInputPart afiPart = new AlternativeFormatInputPart(new PartName("/hw.html"));
        afiPart.setBinaryData(html.getBytes(StandardCharsets.UTF_8));
        //afiPart.setBinaryData(fileContent);
        afiPart.setContentType(new ContentType("text/html"));
        Relationship altChunkRel = wordMLPackage.getMainDocumentPart().addTargetPart(afiPart);

        // .. the bit in document body
        CTAltChunk ac = Context.getWmlObjectFactory().createCTAltChunk();
        ac.setId(altChunkRel.getId());
        wordMLPackage.getMainDocumentPart().addObject(ac);

        // .. content type
        wordMLPackage.getContentTypeManager().addDefaultContentType("html", "text/html");
        wordMLPackage.save(new File(filepath + "test.docx"));
    }
}

This works correctly on my local machine, but after I moved the code to the server, the image is not embedded in the Word document, even though I gave the correct image path. (The same image path works fine when I convert HTML to PDF on the server.) What could be the reason the image is missing when running on the server (a Linux machine with IBM WebSphere Application Server and Apache web server), even though all my paths (Word document, image, HTML document) are the same?

score 1 · Answer 1 · answered Dec 26 '14 at 08:11

Your code relies on Word to convert the HTML in the altChunk into document content when the file is opened. So if you open the Word document on your local machine, it's not going to be able to see an image at E:/HtmlToDoc/LOGO.JPEG on the server.

You could possibly use a URL, or a data URI.
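As a rough illustration of the data URI option, the sketch below Base64-encodes the image bytes and inlines them in the `img` tag, so the HTML no longer points at a file path. The class name `DataUriImage` and its helper methods are hypothetical, and the `image/jpeg` MIME type is an assumption based on the file name in the question. Whether Word's altChunk import actually renders a data URI image is not confirmed here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

// Hypothetical helper, not part of docx4j: inlines an image into HTML as a data URI.
public class DataUriImage {

    // Builds a data URI of the form "data:<mimeType>;base64,<encoded bytes>".
    static String toDataUri(byte[] imageBytes, String mimeType) {
        return "data:" + mimeType + ";base64,"
                + Base64.getEncoder().encodeToString(imageBytes);
    }

    // Wraps the data URI in the same minimal HTML page used in the question.
    static String buildHtml(byte[] imageBytes, String mimeType) {
        return "<html><head><title>Import me</title></head><body>"
                + "<p>Hello World! Sample Program</p>"
                + "<img src=\"" + toDataUri(imageBytes, mimeType) + "\"/>"
                + "</body></html>";
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java DataUriImage <path-to-jpeg>");
            return;
        }
        // Read the image on the machine where the code runs (i.e. the server).
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(buildHtml(bytes, "image/jpeg"));
    }
}
```

The resulting string could then be passed to `afiPart.setBinaryData(...)` in place of the original `html`, since the image data travels inside the HTML part itself.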

Alternatively, use docx4j-ImportXHTML, which will do the conversion without leaving anything to Word.

JasonPlutext


I used a relative path and the image is embedded in the Word document, but when I download the document the image is missing; when I copy the Word document from the server to my local machine, the image is there. I also tried the other method, a data URI, and I get a small cross mark instead of the image. Where am I going wrong in the code? – Vijay Marudhachalam Dec 28 '14 at 11:12
