2

My question is very specific, and I hope that someone has already done this conversion from HTML to DOCX.

To do this, I took some sample code from GitHub and tried it in my local Eclipse setup.

import java.io.File;
import java.io.FileNotFoundException;

import javax.xml.bind.JAXBException;

import org.docx4j.convert.in.xhtml.XHTMLImporterImpl;
import org.docx4j.openpackaging.exceptions.Docx4JException;
import org.docx4j.openpackaging.exceptions.InvalidFormatException;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.WordprocessingML.NumberingDefinitionsPart;

public class HtmlToDocConvert {

    /**
     * @param args
     * @throws FileNotFoundException
     * @throws JAXBException
     * @throws Docx4JException
     */
    public static void main(String[] args) throws FileNotFoundException,
            JAXBException, Docx4JException {
        // TODO Auto-generated method stub

        // File file = new File("C:\\TestWordToHtml\\html\\Test.html");

        String inputfilepath = "C:\\TestWordToHtml\\html\\Test.html";

        try {

            WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage
                    .createPackage();

            NumberingDefinitionsPart ndp = new NumberingDefinitionsPart();
            wordMLPackage.getMainDocumentPart().addTargetPart(ndp);
            ndp.unmarshalDefaultNumbering();

            XHTMLImporterImpl xHTMLImporter = new XHTMLImporterImpl(
                    wordMLPackage);
            xHTMLImporter.setHyperlinkStyle("Hyperlink");
            wordMLPackage.getMainDocumentPart().getContent().addAll(
                    xHTMLImporter.convert(new File(inputfilepath), null));

            File output = new java.io.File(System.getProperty("user.dir")
                    + "/html_output.docx");
            wordMLPackage.save(output);
            System.out.println("done");

            System.out.println("file path where it is stored is" + " "
                    + output.getAbsolutePath());

        }

        catch (InvalidFormatException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

    }

}

The code above gives me the following error:

Exception in thread "main" java.lang.NoSuchMethodError: org.docx4j.org.xhtmlrenderer.docx.DocxRenderer.<init>(Ljava/lang/String;)V
    at org.docx4j.convert.in.xhtml.XHTMLImporterImpl.getRenderer(XHTMLImporterImpl.java:252)
    at org.docx4j.convert.in.xhtml.XHTMLImporterImpl.convert(XHTMLImporterImpl.java:466)
    at HtmlToDocConvert.main(HtmlToDocConvert.java:41)

The jars in my project are as follows:

docx4j-3.2.1.jar
docx4j-ImportXHTML-3.2.1.jar
slf4j-api-1.7.7.jar
slf4j-log4j12-1.7.7.jar
xhtmlrenderer-1.0.0.jar
log4j.jar

I unpacked the xhtmlrenderer jar to look at the DocxRenderer class and saw that there was no such init method (constructor) inside it. I have spent close to half a day trying to figure this out, and I am not sure whether this is the correct way to do the conversion or whether it is even possible.

If someone has done this, could he or she send me the correct xhtmlrenderer jar file or any other dependency needed to achieve this simple task?

Thanks in advance

Regards, Bhanu

SonalPM
MrWayne
  • possible duplicate of [Convert html to doc in java](http://stackoverflow.com/questions/5403356/convert-html-to-doc-in-java) – Alex Tape Oct 10 '14 at 11:01

3 Answers

7

This is not the complete example, is it? Just take a look at ConvertInXHTMLFile.java from docx4j examples.

IMHO you are missing basic parts of the procedure. Furthermore, this topic has been discussed already:

Convert html to doc in java

How to convert HTML to a Microsoft Word document?

Convert HTML to Microsoft Word Document in Java

how to convert HTML to .docx using docx4j?

Alex Tape
  • In case it is not clear, you can find the correct xhtmlrenderer-3.0.0.jar in http://www.docx4java.org/docx4j/docx4j-3_2_0/optional/ImportXHTML/ or via Maven https://github.com/plutext/docx4j-ImportXHTML/blob/master/pom.xml – JasonPlutext Oct 10 '14 at 19:48
  • I was able to get this working after putting the latest xhtmlrenderer-3.0.0.jar on the build path. – MrWayne Jan 15 '15 at 12:44
0

Check the code here. The API used is docx4j-ImportXHTML, and the code is simple to follow: just pass your XHTML to the API as shown in the code and it will do the rest.

nanosoft
0

I had the same problem. Replace your xhtmlrenderer-1.0.0 jar file with version 3.0.0. This is the Maven Repository link
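
If it is not obvious which jar is actually being picked up at runtime, a small check along these lines may help. It is only a sketch using plain Java reflection: the class name and the single String constructor argument are taken from the stack trace in the question, while ClasspathCheck, locationOf and hasConstructor are names made up for this example.

```java
import java.net.URL;
import java.security.CodeSource;

public class ClasspathCheck {

    /** Returns the jar or directory a class was loaded from, or null if the JVM does not report one. */
    public static URL locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? null : source.getLocation();
    }

    /** Reports whether the class declares a constructor with exactly these parameter types. */
    public static boolean hasConstructor(Class<?> type, Class<?>... parameterTypes) {
        try {
            type.getDeclaredConstructor(parameterTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default class name is the one named in the NoSuchMethodError from the question.
        String className = args.length > 0
                ? args[0]
                : "org.docx4j.org.xhtmlrenderer.docx.DocxRenderer";
        try {
            // Load without initializing, so static initializers do not run.
            Class<?> type = Class.forName(className, false,
                    ClasspathCheck.class.getClassLoader());
            System.out.println("Loaded from: " + locationOf(type));
            System.out.println("Has (String) constructor: "
                    + hasConstructor(type, String.class));
        } catch (ClassNotFoundException e) {
            System.out.println(className + " is not on the classpath");
        }
    }
}
```

Run it with the same classpath as the failing program. If the reported location is the old xhtmlrenderer-1.0.0 jar, or the constructor check prints false, the stale jar is most likely still on the build path.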