I am trying to convert a docx file to HTML using XDocReport and then convert that HTML to PDF using `HTMLWorker`. The conversion works, and the styling is preserved in the final PDF.

The only issue is that the contents of the `<style>` tag show up as text at the top of the PDF. How do I resolve this?

FYI, I also tried XMLWorker, but it generates a blank PDF.

HTML

<html>
<head>
<style>
p{margin-top:0pt;margin-bottom:1pt;}p.Normal{margin-bottom:0.0pt;}span.Normal{font-size:12.0pt;}p.TableGrid{margin-bottom:0.0pt;}span.XDocReport_Hyperlink{color:#0000ff;text-decoration:underline;}p.XDocReport_Heading_1{margin-top:24.0pt;margin-bottom:0.0pt;}span.XDocReport_Heading_1{font-family:'Calibri Light';font-size:14.0pt;font-weight:bold;color:#365f91;}p.XDocReport_Heading_2{margin-top:10.0pt;margin-bottom:0.0pt;}span.XDocReport_Heading_2{font-family:'Calibri Light';font-size:13.0pt;font-weight:bold;color:#4f81bd;}p.XDocReport_Heading_3{margin-top:10.0pt;margin-bottom:0.0pt;}span.XDocReport_Heading_3{font-family:'Calibri Light';font-weight:bold;color:#4f81bd;}p.XDocReport_Heading_4{margin-top:10.0pt;margin-bottom:0.0pt;}span.XDocReport_Heading_4{font-family:'Calibri Light';font-weight:bold;font-style:italic;color:#4f81bd;}p.XDocReport_Heading_5{margin-top:10.0pt;margin-bottom:0.0pt;}span.XDocReport_Heading_5{font-family:'Calibri Light';color:#243f60;}p.XDocReport_Heading_6{margin-top:10.0pt;margin-bottom:0.0pt;}span.XDocReport_Heading_6{font-family:'Calibri Light';font-style:italic;color:#243f60;}
</style>
</head>
<body>
// some content
</body>
</html>

iText version: 2.1.7

Code

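import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.StringReader;
import com.lowagie.text.Document;
import com.lowagie.text.html.simpleparser.HTMLWorker;
import com.lowagie.text.pdf.PdfWriter;

OutputStream file = new FileOutputStream(new File("C:\\Users\\Desktop\\pp.pdf"));
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, file); // attaches the PDF writer to the document
document.open();
HTMLWorker htmlWorker = new HTMLWorker(document);
htmlWorker.parse(new StringReader(s)); // s holds the entire HTML shown above
document.close();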

Once again, the converted PDF looks good, with all formatting retained. The only issue is the style contents showing up at the top.

Naxi
  • `HTMLWorker` support for HTML is very limited. If you want to use it, keep your HTML very simple and old-fashioned. – mkl Sep 06 '19 at 07:42
  • Thanks mkl for letting me know. Which other open source library can I use for this conversion from html to pdf? – Naxi Sep 06 '19 at 08:29
  • *"Which other open source library can I use for this conversion from html to pdf"* - You should make that a question on the [Software Recommendations Stack Exchange site](https://softwarerecs.stackexchange.com/). – mkl Sep 06 '19 at 09:43
  • "You can use iText 7 with the pdfHTML add-on" is the answer I would give on the Software Recommendations Stack Exchange. – Amedee Van Gasse Sep 10 '19 at 07:22
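
A possible workaround, offered as a sketch rather than something confirmed in this thread: since `HTMLWorker` seems to treat the contents of `<style>` as ordinary text, the block could be removed from the HTML string before parsing. The class and method names below (`StyleStripper`, `stripStyleBlocks`) are hypothetical, and the regex assumes the HTML contains no nested or malformed `<style>` tags. Rules that exist only in the stylesheet are lost, so this only helps if the formatting that matters is also present inline.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of iText or XDocReport.
final class StyleStripper {
    // Matches a whole <style ...>...</style> element, ignoring case
    // and allowing the CSS to span multiple lines.
    private static final Pattern STYLE_BLOCK =
            Pattern.compile("<style\\b[^>]*>.*?</style\\s*>",
                    Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    private StyleStripper() {
    }

    /** Returns the given HTML with every style element removed. */
    static String stripStyleBlocks(String html) {
        return STYLE_BLOCK.matcher(html).replaceAll("");
    }
}
```

With the code from the question, the call would become `htmlWorker.parse(new StringReader(StyleStripper.stripStyleBlocks(s)));`.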
