I have a Word file. May I use JODConverter to convert it into PDF? I have used JODConverter to convert from DOC to PDF and it gives very good results, but I don't know whether it supports conversion from HTML to PDF.

Thanks!

Ahmed Bilal
  • Yes, it does, at least in the latest version (4.1.0) available in the Maven repository (under the groupId org.jodconverter). But you said you have a Word file and then asked whether JODConverter supports HTML (not Word) to PDF. What exactly are you trying to do? – sbraconnier Nov 07 '17 at 17:33

1 Answer

From what I can see, JODConverter uses LibreOffice or OpenOffice to export doc files to PDF. That software can't fully support the Microsoft Office formats; only Microsoft Office can. Using LibreOffice or OpenOffice to convert your docx files is therefore possible, but in some cases (or in all of them) the results are unpleasant. As far as I know, you have two options:

  1. You could convert all your docx documents to the LibreOffice format and then export them to PDF without compatibility issues, using the LibreOffice software (or your external Java library).

  2. You could use Microsoft Office Word to do the conversion from docx to PDF. To achieve that, you need a script that tells MS Word to do the conversion, and you call that script from Java using Runtime.getRuntime().exec(command). At least, that's the way I know and the way I do it.
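For option 1, LibreOffice can also be driven from Java without an extra library, through its headless command line. The following is only a sketch: it assumes the `soffice` executable is on the PATH, and the class and method names (`LibreOfficeExport`, `buildCommand`, `convert`) are made up for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

final class LibreOfficeExport {

    // Builds: soffice --headless --convert-to pdf --outdir <dir> <file>
    // Assumption: "soffice" is resolvable from the PATH of the Java process.
    static List<String> buildCommand(File input, File outputDir) {
        return Arrays.asList(
                "soffice", "--headless",
                "--convert-to", "pdf",
                "--outdir", outputDir.getAbsolutePath(),
                input.getAbsolutePath());
    }

    // Starts LibreOffice, waits for it to finish and returns its exit code.
    static int convert(File input, File outputDir) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(buildCommand(input, outputDir))
                .inheritIO() // forward LibreOffice's console output to this process
                .start();
        return p.waitFor();
    }
}
```

A call such as `LibreOfficeExport.convert(new File("report.docx"), new File("out"))` should leave the PDF in the `out` folder, but the fidelity limits described above still apply to Microsoft Office files.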

Here is the script for option 2, which I found years ago and still use in some cases. The script is not mine, so I can't take any credit. All you need to do is create a new file with the extension .vbs and add the code below.

Option Explicit
Doc2PDF Wscript.Arguments.Item(0)
Sub Doc2PDF( myFile )
   Dim objDoc, objFile, objFSO, objWord, strFile, strPDF
   Const wdFormatDocument                    =  0
   Const wdFormatDocument97                  =  0
   Const wdFormatDocumentDefault             = 16
   Const wdFormatDOSText                     =  4
   Const wdFormatDOSTextLineBreaks           =  5
   Const wdFormatEncodedText                 =  7
   Const wdFormatFilteredHTML                = 10
   Const wdFormatFlatXML                     = 19
   Const wdFormatFlatXMLMacroEnabled         = 20
   Const wdFormatFlatXMLTemplate             = 21
   Const wdFormatFlatXMLTemplateMacroEnabled = 22
   Const wdFormatHTML                        =  8
   Const wdFormatPDF                         = 17
   Const wdFormatRTF                         =  6
   Const wdFormatTemplate                    =  1
   Const wdFormatTemplate97                  =  1
   Const wdFormatText                        =  2
   Const wdFormatTextLineBreaks              =  3
   Const wdFormatUnicodeText                 =  7
   Const wdFormatWebArchive                  =  9
   Const wdFormatXML                         = 11
   Const wdFormatXMLDocument                 = 12
   Const wdFormatXMLDocumentMacroEnabled     = 13
   Const wdFormatXMLTemplate                 = 14
   Const wdFormatXMLTemplateMacroEnabled     = 15
   Const wdFormatXPS                         = 18
   Const wdFormatOfficeDocumentTemplate      = 23
   Const wdFormatMediaWiki                   = 24 
   Set objFSO = CreateObject( "Scripting.FileSystemObject" )
   Set objWord = CreateObject( "Word.Application" )
   With objWord
       .Visible = false
       If objFSO.FileExists( myFile ) Then
           Set objFile = objFSO.GetFile( myFile )
           strFile = objFile.Path
       Else
           WScript.Echo "FILE OPEN ERROR: The file does not exist" & vbCrLf
           .Quit
           Exit Sub
       End If
       strPDF = objFSO.BuildPath( objFile.ParentFolder, _
                objFSO.GetBaseName( objFile ) & ".pdf" )
       .Documents.Open strFile
       Set objDoc = .ActiveDocument
       objDoc.SaveAs strPDF, wdFormatPDF
       objDoc.Close
       .Quit
   End With
End Sub
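The script writes the PDF into the same folder as the source document, using the same base name with a .pdf extension. On the Java side you can compute that path to check afterwards whether the PDF was actually created. This is a small sketch; the names `PdfPaths` and `expectedPdfFor` are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class PdfPaths {

    // Mirrors the script's BuildPath(ParentFolder, GetBaseName & ".pdf"):
    // same folder, file name with its last extension replaced by ".pdf".
    static Path expectedPdfFor(Path source) {
        String name = source.getFileName().toString();
        int dot = name.lastIndexOf('.');
        // Assumption: a name without a dot (or starting with one) is kept whole.
        String base = (dot > 0) ? name.substring(0, dot) : name;
        return source.resolveSibling(base + ".pdf");
    }

    public static void main(String[] args) {
        System.out.println(expectedPdfFor(Paths.get("reports", "q3.docx")));
    }
}
```

After the script returns, `Files.exists(PdfPaths.expectedPdfFor(docPath))` tells you whether the conversion produced a file.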

Here is a small example of a method you could use to export docx files to PDF:

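import java.io.File;
import java.io.IOException;

private void convertFile(String text) {
    File f = new File(text);

    if (f.exists() && f.getName().endsWith(".docx")) {
        // The full path of your script; in my case it is on the Desktop.
        String converterLocation = System.getProperty("user.home") + "\\Desktop\\wordToPdf.vbs";

        // wscript <converterLocation> <fullFilePath>
        // Passing the arguments as an array keeps paths that contain spaces intact.
        String[] command = { "wscript", converterLocation, f.getAbsolutePath() };

        try {
            Process p = Runtime.getRuntime().exec(command);
            // Block until Word has finished the conversion.
            p.waitFor();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can react to it.
            Thread.currentThread().interrupt();
        }
    }
}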
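If Word hangs, waitFor() never returns, so a timeout can help. Below is a hedged variant of the same idea using ProcessBuilder. It runs the script with cscript instead of wscript, so that WScript.Echo messages go to the console rather than to a dialog box that would block the call. The class name `WordToPdfRunner`, its methods, and the timeout parameter are assumptions for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

final class WordToPdfRunner {

    // cscript is the console script host; //NoLogo suppresses its banner.
    static List<String> buildCommand(String scriptPath, String docPath) {
        return Arrays.asList("cscript", "//NoLogo", scriptPath, docPath);
    }

    // Returns true if the script finished within the timeout, false if it was killed.
    static boolean run(String scriptPath, File doc, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(buildCommand(scriptPath, doc.getAbsolutePath()))
                .inheritIO() // show the script's messages on this process's console
                .start();
        if (!p.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            p.destroyForcibly(); // the script is stuck, stop waiting for it
            return false;
        }
        return true;
    }
}
```

Note that the script as posted does not set an exit code when the file is missing, so finishing in time does not prove that a PDF was written; checking that the output file exists is more reliable.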

I know that this answer is not what you expected, but I just want to point out a solution to your problem that I am aware of.

JKostikiadis
  • You may take a look at [documents4j](http://documents4j.com/#/) if you need a library that uses Microsoft Office Word for better MSWord document conversion. – sbraconnier Nov 07 '17 at 17:39
  • @sbraconnier I was working with some complex MS documents, especially Excel files, and their conversion always ended up horrible. I will give it a look for sure. – JKostikiadis Nov 07 '17 at 17:45