I am converting a .docx file to PDF, and the conversion itself works fine. The problem is that the document's formatting is lost. How can I make sure that formatting such as bold text, tables, etc. is preserved during conversion?

I am using docx4j for conversion.

Below is the output I am getting:

NOT IMPLEMENTED: support for w:ptab -
NOT IMPLEMENTED: support for w:ptab -3
NOT IMPLEMENTED: support for w:altChunk -
NOT IMPLEMENTED: support for w:altChunk -
NOT IMPLEMENTED: support for w:altChunk -
NOT IMPLEMENTED: support for w:altChunk -
NOT IMPLEMENTED: support for w:altChunk -
NOT IMPLEMENTED: support for w:altChunk -
Shane
  • Hehe. I had the same issue. Someone answered: "Correctly parsing a .docx and then generating a PDF from it is very hard. Writing something that remotely works costs man-years (plural)". In the end I used LibreOffice on Unix with "libreoffice --headless --convert-to pdf filename.docx". The result is quite good, but converting always shreds the layout :( – user2718671 Apr 07 '14 at 08:25
  • What version of docx4j are you using? Bold should generally appear, and with the current docx4j nightly you won't need that setHeaderExtent trick. – JasonPlutext Apr 07 '14 at 08:35
  • @JasonPlutext: I am using version 3.0.1. I am getting the "NOT IMPLEMENTED: support for w:altChunk" error. – Shane Apr 07 '14 at 08:43

1 Answer

ptab: as the message says, there is currently no support for the ptab element. It's not commonly used. You can either remove it from your docx, or we could look at adding support for it to docx4j.

altChunk: these need to be preprocessed into "real" docx content for docx4j's PDF output. If the altChunk is of type XHTML, docx4j can do that. If it is a docx altChunk, the Enterprise Edition is required. There are other types of altChunks, which you should avoid if you want to convert to PDF...
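Before converting, it can help to find out which parts of the docx actually contain these two elements. The following is a minimal sketch using only the Java standard library, not docx4j: it opens the docx as a zip archive and counts w:ptab and w:altChunk elements in every XML part under word/. The class name UnsupportedElementScan and its methods are hypothetical, and scanning everything under word/ is an assumption about where the main document, headers and footers are stored.

```java
import java.io.InputStream;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of docx4j.
public class UnsupportedElementScan {
    // WordprocessingML main namespace (usually bound to the "w" prefix).
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    // Local names of the elements reported as NOT IMPLEMENTED in the question.
    static final Set<String> WATCHED =
        new HashSet<>(Arrays.asList("ptab", "altChunk"));

    // Parses one XML part and adds the number of watched elements to counts.
    static void countInto(InputStream xml, Map<String, Integer> counts)
            throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // match on namespace, not on prefix
        factory.newSAXParser().parse(xml, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (W_NS.equals(uri) && WATCHED.contains(localName)) {
                    counts.merge(localName, 1, Integer::sum);
                }
            }
        });
    }

    // Scans every XML part under word/ (assumed to cover body, headers, footers).
    static Map<String, Integer> scan(Path docx) throws Exception {
        Map<String, Integer> counts = new TreeMap<>();
        try (ZipFile zip = new ZipFile(docx.toFile())) {
            for (ZipEntry entry : Collections.list(zip.entries())) {
                String name = entry.getName();
                if (name.startsWith("word/") && name.endsWith(".xml")) {
                    try (InputStream in = zip.getInputStream(entry)) {
                        countInto(in, counts);
                    }
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java UnsupportedElementScan file.docx");
            return;
        }
        System.out.println(scan(Paths.get(args[0])));
    }
}
```

An empty result means neither element was found under word/. A non-zero altChunk count means those chunks still have to be preprocessed into real docx content before the PDF step.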

JasonPlutext