
I am trying to convert a DOCX file into a PDF on my local system with the simple code snippet below, which is taken as is from http://documents4j.com/#/.

  1. The code runs as part of a service inside a web app (.war) deployed on Tomcat 9.0.37, which itself runs as a Windows service.
  2. I created the folder C:\Windows\SysWOW64\config\systemprofile\Desktop (because the documents4j application runs as a Windows service).
  3. MS Office Professional 2016 is installed (licensed copy).
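
To double-check point 2 on both VMs, a small diagnostic along these lines could list which system-profile Desktop folders are missing. This is only a sketch: the class and method names are made up for illustration, and checking the System32 location in addition to SysWOW64 is an assumption (it would matter if a 64-bit Office were involved), not something documents4j documents.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class SystemProfileDesktopCheck {

    // Desktop folders of the service (system) profile, relative to the Windows directory.
    // SysWOW64 is the one from the list above; System32 is an assumption of this sketch.
    static final String[] PROFILE_DESKTOPS = {
        "SysWOW64/config/systemprofile/Desktop",
        "System32/config/systemprofile/Desktop"
    };

    /** Returns the expected Desktop folders that do not exist under windowsRoot. */
    static List<Path> missingDesktops(Path windowsRoot) {
        List<Path> missing = new ArrayList<>();
        for (String relative : PROFILE_DESKTOPS) {
            Path desktop = windowsRoot.resolve(relative);
            if (!Files.isDirectory(desktop)) {
                missing.add(desktop);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Default to C:\Windows; another root can be passed as the first argument.
        Path windowsRoot = Paths.get(args.length > 0 ? args[0] : "C:\\Windows");
        List<Path> missing = missingDesktops(windowsRoot);
        if (missing.isEmpty()) {
            System.out.println("All checked Desktop folders exist.");
        }
        for (Path path : missing) {
            System.out.println("Missing: " + path);
        }
    }
}
```

Running it under the same account as the Tomcat service would show whether both VMs really match in this respect.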

I have the same configuration, as described in the three points above, on two VMs: VM 1 runs Windows Server 2012 and VM 2 runs Windows Server 2019. The DOCX to PDF conversion works on VM 1 (Windows Server 2012), but on VM 2 (Windows Server 2019) I am facing the problem below.

When the execute method is called, the current thread blocks forever, which leads to a memory leak. Tomcat logs the following message for the thread stuck in execute:

org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [appName] is still processing a request that has yet to finish. This is very likely to create a memory leak. You can control the time allowed for requests to finish by using the unloadDelay attribute of the standard Context implementation. Stack trace of request processing thread:[
 java.base@11.0.5/jdk.internal.misc.Unsafe.park(Native Method)
 java.base@11.0.5/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
 java.base@11.0.5/java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885)
 java.base@11.0.5/java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1039)
 java.base@11.0.5/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1345)
 java.base@11.0.5/java.util.concurrent.CountDownLatch.await(CountDownLatch.java:232)
 com.documents4j.job.AbstractFutureWrappingPriorityFuture.get(AbstractFutureWrappingPriorityFuture.java:204)
 com.documents4j.job.AbstractFutureWrappingPriorityFuture.get(AbstractFutureWrappingPriorityFuture.java:10)
 com.documents4j.job.ConversionJobAdapter.execute(ConversionJobAdapter.java:13)
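
The trace shows the request thread parked in an untimed Future.get(). One way to at least keep the thread from blocking forever would be to wait with a timeout instead. The sketch below shows only that waiting pattern using java.util.concurrent; the class and method names are hypothetical, and a never-completing CompletableFuture stands in for the hung conversion. As far as I understand the documents4j API, calling schedule() instead of execute() on the job returns such a Future<Boolean>, but treat that as an assumption to verify.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BoundedConversionWait {

    /**
     * Waits at most the given time for the job to finish.
     * On timeout the job is cancelled and false is returned.
     */
    static boolean awaitConversion(Future<Boolean> job, long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException {
        try {
            return Boolean.TRUE.equals(job.get(timeout, unit));
        } catch (TimeoutException e) {
            // Give up on the job so the calling thread is released.
            job.cancel(true);
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a conversion that never completes (assumption for the demo).
        Future<Boolean> hungJob = new CompletableFuture<>();
        boolean converted = awaitConversion(hungJob, 200, TimeUnit.MILLISECONDS);
        System.out.println("converted = " + converted);
        System.out.println("cancelled = " + hungJob.isCancelled());
    }
}
```

This only frees the calling thread. It does not explain why the conversion hangs on Windows Server 2019, and cancelling the Future may well leave a frozen WINWORD process behind.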

Code Snippet 1:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.concurrent.TimeUnit;
import com.documents4j.api.DocumentType;
import com.documents4j.api.IConverter;
import com.documents4j.job.LocalConverter;

public class WordMLToPDF2 {

    public static void main(String[] args) {

        File inputWord  = new File("C:\\Users\\anshu\\Downloads\\test.docx");
        File outputFile = new File("C:\\Users\\anshu\\Downloads\\test.pdf");
        // tmpDir was not defined in the original snippet; this location is an assumption.
        File tmpDir = new File(System.getProperty("java.io.tmpdir"), "documents4j");
        tmpDir.mkdirs();

        // try-with-resources closes both streams even if the conversion fails
        try (InputStream docxInputStream = new FileInputStream(inputWord);
             OutputStream outputStream = new FileOutputStream(outputFile)) {
            IConverter converter = LocalConverter.builder()
                       .baseFolder(tmpDir)
                       .workerPool(20, 25, 2, TimeUnit.SECONDS)
                       .processTimeout(30, TimeUnit.SECONDS)
                       .build();
            // execute() blocks until the conversion finishes; this is the call that hangs on VM 2
            converter.convert(docxInputStream).as(DocumentType.DOCX).to(outputStream).as(DocumentType.PDF).execute();
            System.out.println("success");
            converter.shutDown();
        } catch (Exception e) {
            e.printStackTrace();
        }

    }

}

These are the dependencies I have configured in pom.xml:

   <dependencies>
        <dependency>
            <groupId>org.docx4j</groupId>
            <artifactId>docx4j-documents4j-local</artifactId>
            <version>8.3.3</version>
        </dependency>
        <dependency>
            <groupId>org.docx4j</groupId>
            <artifactId>docx4j-JAXB-ReferenceImpl</artifactId>
            <version>8.3.3</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-slf4j-impl</artifactId>
            <version>2.17.1</version>
        </dependency>
        <!-- API, java.xml.bind module -->
        <dependency>
            <groupId>jakarta.xml.bind</groupId>
            <artifactId>jakarta.xml.bind-api</artifactId>
            <version>2.3.2</version>
        </dependency>

        <!-- Runtime, com.sun.xml.bind module -->
        <dependency>
            <groupId>org.glassfish.jaxb</groupId>
            <artifactId>jaxb-runtime</artifactId>
            <version>2.3.2</version>
        </dependency>
    </dependencies>

EDIT:

After analyzing the source code, this is what I understood:

  • Internally, documents4j executes VBScripts with the help of another library, zt-exec (ZeroTurnaround Process Executor), to start WINWORD, to convert the DOCX to PDF, and to stop WINWORD (the scripts word_start, word_convert and word_shutdown).

  • WINWORD freezes when word_convert executes wordApplication.Documents.Open(inputFile, False, True, False).

  • In the statement above, wordApplication holds the Word.Application object.
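
To narrow this down outside of documents4j, the conversion script could be launched by hand through cscript with a hard timeout, roughly the way a process executor would do it. The following is a minimal sketch using only ProcessBuilder from the JDK, not zt-exec. The class and method names are hypothetical, and the script path and its arguments have to be supplied by the caller, since the arguments word_convert expects are not shown here.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

final class ScriptProbe {

    /** Builds a cscript command line for the given script and its arguments. */
    static List<String> cscriptCommand(String scriptPath, List<String> scriptArgs) {
        List<String> command = new ArrayList<>();
        command.add("cscript");
        command.add("//NoLogo"); // suppress the Windows Script Host banner
        command.add(scriptPath);
        command.addAll(scriptArgs);
        return command;
    }

    /** Runs the command; returns its exit code, or -1 if it was killed after the timeout. */
    static int runWithTimeout(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .inheritIO() // show the script's output in this console
                .start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            return -1;
        }
        return process.exitValue();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: ScriptProbe <script.vbs> [script arguments...]");
            return;
        }
        List<String> scriptArgs = Arrays.asList(args).subList(1, args.length);
        int exitCode = runWithTimeout(cscriptCommand(args[0], scriptArgs), 30);
        System.out.println(exitCode == -1 ? "timed out" : "exit code " + exitCode);
    }
}
```

If the script also hangs when started this way under the service account, the problem is more likely in the Word or Windows Server 2019 setup than in documents4j itself. Note that killing cscript does not necessarily close a WINWORD instance it has already started.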

Geert Bellekens
Anshu
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/251473/discussion-on-question-by-anshu-convert-docx-to-pdf-with-documents4j-library). – Machavity Jan 29 '23 at 13:54
