I'm using iText in order to convert html into a pdf, but I keep getting a RuntimeWorkerException thrown at parseXHtml. Here's my code:

Document tempDoc = new Document();
PdfWriter pdfWriter = PdfWriter.getInstance(tempDoc, out);
tempDoc.open();
XMLWorkerHelper.getInstance().parseXHtml(pdfWriter, tempDoc, new ByteArrayInputStream(html.getBytes()));
tempDoc.close();

I'm not too familiar with the differences between HTML and XHTML, so I'm at a bit of a loss as to how I should handle this. Here's the html source if it helps.

Drazen Bjelovuk
  • This sounds like an iText issue, either because it doesn't handle newer HTML tags or because of a bug. Unfortunately, there's probably no way around it, though you could report it to the folks who maintain iText. – ControlAltDel Aug 25 '14 at 17:42
  • The error message is pretty clear: you have a `<meta>` tag in the header that isn't closed, which is valid in `HTML` but not in `XHTML`, which is what you are parsing it as. You need to close those: `<meta ... />` – Chris Haas Aug 25 '14 at 17:55

5 Answers

19

The error message is pretty clear: you have a <meta> tag in the header that isn't closed. That is valid in HTML but not in XHTML, which is what you are parsing it as. You need to close those tags: <meta ... />

Chris Haas
1

Remember to close all meta tags

<meta ... />
Flavio Troia
  • This actually has already been proposed in the [accepted answer](https://stackoverflow.com/a/25493133/1729265)... – mkl Sep 27 '19 at 10:46
0

If you are using XMLWorkerHelper, make sure you close image and line-break tags properly, i.e. end them with />.

0

For a similar error message -

invalid nested tag body found, expected closing tag meta

it turned out that the XHTML I was parsing had a <script> section at the bottom that contained JS code, something like:

<script>
  function my_func(var) {
    ...
  }     
</script>

After removing that code (with simple string manipulations), I was able to get the .parseXHtml to work without issues.
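
The "simple string manipulations" mentioned above could look roughly like the following. This is only a sketch: the class and method names (ScriptStripper, stripScripts) are made up for illustration, and it assumes every <script> element has an explicit </script> closing tag. Presumably the script body trips the parser because raw JS characters such as < and & are not well-formed XML outside a CDATA section.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of iText: removes <script>...</script> blocks
// from a markup string before it is handed to parseXHtml.
class ScriptStripper {
    // DOTALL lets '.' span line breaks; the lazy '.*?' stops at the first </script>.
    private static final Pattern SCRIPT_BLOCK = Pattern.compile(
            "<script\\b[^>]*>.*?</script\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static String stripScripts(String xhtml) {
        // Assumption: scripts are not needed in the PDF, so they are simply dropped.
        return SCRIPT_BLOCK.matcher(xhtml).replaceAll("");
    }
}
```

The cleaned string can then be passed to parseXHtml as in the question. A regex is not a real HTML parser, so self-closed script tags or scripts whose source contains the text </script> would need different handling.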

Yair Segal
0

You have to close each and every tag. For example, in HTML

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

is valid.

But in XHTML you have to use

<meta http-equiv="Content-Type" content="text/html; charset=utf-8"></meta>

So close each and every tag in the HTML (for example, meta, col, and img tags).
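
If you cannot fix the HTML at its source, the closing can be done in code before calling parseXHtml. The sketch below is one possible approach, not an iText feature: VoidTagCloser and closeVoidTags are hypothetical names, the tag list is partial, and it assumes no attribute value contains a > character.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical pre-processing helper: rewrites unclosed void elements
// such as <meta ...> into the self-closed form <meta ... />.
class VoidTagCloser {
    // Partial list of void elements; extend as needed.
    // The optional trailing group swallows an explicit closer such as </meta>.
    private static final Pattern VOID_TAG = Pattern.compile(
            "<(meta|link|img|br|hr|input|col)\\b([^>]*?)\\s*/?>(?:\\s*</\\1\\s*>)?",
            Pattern.CASE_INSENSITIVE);

    static String closeVoidTags(String html) {
        Matcher m = VOID_TAG.matcher(html);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Rebuild the tag with its original attributes and a self-closing slash.
            String fixed = "<" + m.group(1) + m.group(2) + " />";
            m.appendReplacement(sb, Matcher.quoteReplacement(fixed));
        }
        m.appendTail(sb);
        return sb.toString();
    }
}
```

Tags that are already closed pass through in the same self-closed form, so the method can safely be applied more than once. For arbitrary real-world HTML, a proper HTML-to-XHTML tidying library is likely more robust than a regex.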

mkl