Below is the simple code snippet I am using to convert a DOCX file to PDF on my local system. It is taken as-is from http://documents4j.com/#/.
- The code runs as part of a service inside a web app (.war) deployed on Tomcat 9.0.37, which runs as a Windows service.
- I created the folder C:\Windows\SysWOW64\config\systemprofile\Desktop (because the documents4j application runs as a Windows service).
- MS Office Professional 2016 is installed (licensed copy).
I use the same configuration, as described in the three points above, on two VMs: VM 1 (Windows Server 2012) and VM 2 (Windows Server 2019). The DOCX to PDF conversion works on VM 1, but on VM 2 I am facing the problem below.
On calling the execute method, the current thread blocks forever, which Tomcat reports as a likely memory leak. Below is the message logged for the thread that is stuck in execute.
org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [appName] is still processing a request that has yet to finish. This is very likely to create a memory leak. You can control the time allowed for requests to finish by using the unloadDelay attribute of the standard Context implementation. Stack trace of request processing thread:[
java.base@11.0.5/jdk.internal.misc.Unsafe.park(Native Method)
java.base@11.0.5/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
java.base@11.0.5/java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885)
java.base@11.0.5/java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1039)
java.base@11.0.5/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1345)
java.base@11.0.5/java.util.concurrent.CountDownLatch.await(CountDownLatch.java:232)
com.documents4j.job.AbstractFutureWrappingPriorityFuture.get(AbstractFutureWrappingPriorityFuture.java:204)
com.documents4j.job.AbstractFutureWrappingPriorityFuture.get(AbstractFutureWrappingPriorityFuture.java:10)
com.documents4j.job.ConversionJobAdapter.execute(ConversionJobAdapter.java:13)
Code Snippet 1:
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.concurrent.TimeUnit;

import com.documents4j.api.DocumentType;
import com.documents4j.api.IConverter;
import com.documents4j.job.LocalConverter;

public class WordMLToPDF2 {

    public static void main(String[] args) {
        File inputWord = new File("C:\\Users\\anshu\\Downloads\\test.docx");
        File outputFile = new File("C:\\Users\\anshu\\Downloads\\test.pdf");
        // Assumption: tmpDir was not defined in the snippet as posted;
        // the JVM temp directory stands in for it here.
        File tmpDir = new File(System.getProperty("java.io.tmpdir"));

        IConverter converter = null;
        // try-with-resources closes both streams even if the conversion fails
        try (InputStream docxInputStream = new FileInputStream(inputWord);
             OutputStream outputStream = new FileOutputStream(outputFile)) {
            converter = LocalConverter.builder()
                    .baseFolder(tmpDir)
                    .workerPool(20, 25, 2, TimeUnit.SECONDS)
                    .processTimeout(30, TimeUnit.SECONDS)
                    .build();
            // execute() blocks until the conversion has finished
            converter.convert(docxInputStream).as(DocumentType.DOCX)
                    .to(outputStream).as(DocumentType.PDF)
                    .execute();
            System.out.println("success");
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Shut the converter down even when the conversion throws
            if (converter != null) {
                converter.shutDown();
            }
        }
    }
}
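Because execute() waits without any upper bound (the stack trace shows it parked in CountDownLatch.await), one way to keep the request thread from hanging forever is to wait on a Future with a timeout instead. The sketch below uses only the JDK. The class BoundedWait and its Outcome enum are hypothetical names made up for illustration. As far as I can tell from the documents4j API, calling schedule() instead of execute() on the conversion job returns a Future<Boolean> that could be passed to such a helper, but I have not verified this on the failing VM, and it only limits the wait; it does not explain why Word hangs.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedWait {

    /** Possible results of waiting for a conversion-like task. */
    public enum Outcome { SUCCEEDED, FAILED, TIMED_OUT }

    /**
     * Waits for the job for at most the given time.
     * On timeout the job is cancelled so the caller is not blocked forever.
     */
    public static Outcome await(Future<Boolean> job, long timeout, TimeUnit unit)
            throws InterruptedException {
        try {
            return Boolean.TRUE.equals(job.get(timeout, unit))
                    ? Outcome.SUCCEEDED
                    : Outcome.FAILED;
        } catch (TimeoutException e) {
            job.cancel(true); // give up on the stuck job
            return Outcome.TIMED_OUT;
        } catch (ExecutionException e) {
            return Outcome.FAILED; // the job itself threw an exception
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-ins for a conversion job: one already finished, one that never completes.
        Future<Boolean> finished = CompletableFuture.completedFuture(true);
        Future<Boolean> stuck = new CompletableFuture<>();

        System.out.println(await(finished, 1, TimeUnit.SECONDS));     // prints SUCCEEDED
        System.out.println(await(stuck, 200, TimeUnit.MILLISECONDS)); // prints TIMED_OUT
    }
}
```

With documents4j, the hypothetical usage would be to replace the trailing .execute() with .schedule() and hand the returned future to BoundedWait.await with a timeout somewhat larger than the configured processTimeout.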
Below are the dependencies I have configured in pom.xml:
<dependencies>
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-documents4j-local</artifactId>
        <version>8.3.3</version>
    </dependency>
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-JAXB-ReferenceImpl</artifactId>
        <version>8.3.3</version>
    </dependency>
    <dependency>
        <groupId>org.apache.logging.log4j</groupId>
        <artifactId>log4j-slf4j-impl</artifactId>
        <version>2.17.1</version>
    </dependency>
    <!-- API, java.xml.bind module -->
    <dependency>
        <groupId>jakarta.xml.bind</groupId>
        <artifactId>jakarta.xml.bind-api</artifactId>
        <version>2.3.2</version>
    </dependency>
    <!-- Runtime, com.sun.xml.bind module -->
    <dependency>
        <groupId>org.glassfish.jaxb</groupId>
        <artifactId>jaxb-runtime</artifactId>
        <version>2.3.2</version>
    </dependency>
</dependencies>
EDIT:
After analyzing the source code, this is what I understood:
Internally, documents4j executes VBScripts with the help of another library, zt-exec (ZeroTurnaround Process Executor), to start WINWORD, to convert the DOCX to PDF, and to stop WINWORD.
The WINWORD program freezes when it executes wordApplication.Documents.Open(inputFile, False, True, False) in word_convert.
In the above statement, wordApplication holds the Word.Application object.
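Since the freeze happens at Documents.Open while Word runs under the service account, one thing that can be checked from Java is whether the systemprofile Desktop folder exists for both the 32-bit and the 64-bit profile. The sketch below is only a diagnostic aid under assumptions: the SysWOW64 path is the one I created above, while the System32 path is an assumption based on a commonly cited workaround for Office automation under a Windows service, and ProfileFolderCheck is a hypothetical helper name. I do not know yet whether a missing folder is the cause on Windows Server 2019.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ProfileFolderCheck {

    /** Returns the candidate paths that are not existing directories. */
    public static List<Path> missingFolders(List<Path> candidates) {
        List<Path> missing = new ArrayList<>();
        for (Path candidate : candidates) {
            if (!Files.isDirectory(candidate)) {
                missing.add(candidate);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<Path> candidates = List.of(
                // Folder created in the setup described in the question
                Paths.get("C:\\Windows\\SysWOW64\\config\\systemprofile\\Desktop"),
                // Assumption: 64-bit counterpart, not mentioned in the question
                Paths.get("C:\\Windows\\System32\\config\\systemprofile\\Desktop"));

        List<Path> missing = missingFolders(candidates);
        if (missing.isEmpty()) {
            System.out.println("All profile Desktop folders exist.");
        } else {
            missing.forEach(p -> System.out.println("Missing folder: " + p));
        }
    }
}
```

Running this under the same account as the Tomcat service on both VMs would show whether the two machines differ in this respect.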