
I'm using itextpdf-5.0.6.jar (Java 8), and when I try to export HTML that contains an img tag with a base64 source, I get a file-not-found exception.

If I remove the image tag, everything works fine.

I found a few solutions that override the image tag processor, but most of them are old and not compatible with version 5.0.6.

Here is the HTML I send:

    "<!doctype html>\n<html lang=\"en\">\n<head>\n    
<meta charset=\"UTF-8\">\n    
<title>Test PDF</title>\n</head>\n<body>\n\n
<div class=\"pdf-header\">\n\n 
  <img src=\"\">     \n\n\n</div>\n\n<div class=\"main\">\n<div class=\"canvas\">\nHellow world</div></div></body>\n</html>"

Part of my code:

fileOutputStream = new FileOutputStream(file);
Document document = new Document();
PdfWriter.getInstance(document, fileOutputStream);
document.open();
HTMLWorker htmlWorker = new HTMLWorker(document);
StringReader stringReader = new StringReader(htmlCode);
htmlWorker.parse(stringReader);
document.close();
fileOutputStream.close();

Any help would be appreciated, thanks.

Bruno Lowagie
Lior Bachar

1 Answer


Please stop using HTMLWorker. As has been repeated many times on StackOverflow, the HTMLWorker class was abandoned in favor of XML Worker a long time ago. We won't invest in further development of HTMLWorker, so it's a very bad choice to use it. Please switch to XML Worker.

Also upgrade to the latest iText version. The version you are using dates from February 4, 2011, and many bugs have been fixed in the 4 years that have passed. Make sure the iText jar and the XML Worker jar have the same version number.

Base64 images aren't supported yet, but I have made you a very simple Proof of Concept, showing how easy it is to add support for such images. Take a look at the ParseHtml4 example and the resulting PDF: html_4.pdf.

To achieve this, you need to write an implementation of the ImageProvider interface. I have done this by extending the AbstractImageProvider class:

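import java.io.IOException;

import com.itextpdf.text.BadElementException;
import com.itextpdf.text.Image;
import com.itextpdf.text.pdf.codec.Base64;
import com.itextpdf.tool.xml.pipeline.html.AbstractImageProvider;

class Base64ImageProvider extends AbstractImageProvider {

    @Override
    public Image retrieve(String src) {
        // Position of the marker that precedes the encoded image data
        int pos = src.indexOf("base64,");
        try {
            if (src.startsWith("data") && pos > 0) {
                // Data URI: decode everything after "base64," (7 characters)
                byte[] img = Base64.decode(src.substring(pos + 7));
                return Image.getInstance(img);
            }
            else {
                // Anything else is treated as a path or URL
                return Image.getInstance(src);
            }
        } catch (BadElementException ex) {
            return null;
        } catch (IOException ex) {
            return null;
        }
    }

    @Override
    public String getImageRootPath() {
        // No root path is used in this example
        return null;
    }
}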

As you can see, I check for the existence of "base64," in whatever is passed to XML Worker through the src attribute of the img tag. If that String is present, I decode whatever follows that "base64," and I return an Image object that is created using the resulting bytes.
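To try out the src check and the decoding step on their own, without iText on the classpath, here is a minimal sketch. It is not part of the ParseHtml4 example: the class name `DataUriDemo` and the method `decodeDataUri` are made up for illustration, and it uses the JDK's `java.util.Base64` (Java 8+) instead of iText's own `Base64` class.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class, only meant to illustrate the check described above.
class DataUriDemo {

    // Marker that separates the data URI header from the encoded payload
    private static final String MARKER = "base64,";

    // Returns the decoded payload of a base64 data URI,
    // or null when src does not look like one (e.g. a file path or URL).
    public static byte[] decodeDataUri(String src) {
        int pos = src.indexOf(MARKER);
        if (src.startsWith("data") && pos > 0) {
            String payload = src.substring(pos + MARKER.length());
            // The MIME decoder skips line breaks that may appear in long attribute values
            return Base64.getMimeDecoder().decode(payload);
        }
        return null;
    }

    public static void main(String[] args) {
        // A tiny text payload stands in for real image bytes
        byte[] bytes = decodeDataUri("data:text/plain;base64,SGVsbG8=");
        System.out.println(new String(bytes, StandardCharsets.UTF_8)); // prints "Hello"

        // A plain path is not a data URI, so nothing is decoded
        System.out.println(decodeDataUri("images/logo.png") == null); // prints "true"
    }
}
```

In the provider above, the bytes returned by such a helper would be the ones handed to `Image.getInstance(byte[])`, while a `null` result corresponds to the branch that falls back to `Image.getInstance(src)`.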

Once you have this ImageProvider implementation, it's only a matter of passing it to XML Worker.

Bruno Lowagie
  • The solution works as expected! Thanks a lot. Bruno, is there an option to ignore bad HTML? I first got an error for an element that doesn't have a closing tag, and then for - can I ignore this validation? – Lior Bachar Mar 23 '15 at 11:56
  • I use JSoup to fix the HTML: http://stackoverflow.com/questions/26652029/how-to-do-html-to-xml-conversion-to-generate-closed-tags – Bruno Lowagie Mar 23 '15 at 11:58
  • I sent this HTML: http://pastebin.com/Ug1M8n43 but it doesn't pick up the CSS that is embedded with – Lior Bachar Mar 23 '15 at 12:04
  • CSS is supported in XML Worker (not in `HTMLWorker`), but obviously not all CSS attributes are supported for 2 reasons: (1.) PDF is very different from HTML and some attributes that exist for HTML will never work for PDFs created with iText, (2.) We only implemented the CSS attributes we were paid for (many people use iText and XML Worker without paying). – Bruno Lowagie Mar 23 '15 at 12:09
  • I am only one person and I am most definitely not a sales person. You should post your question here: http://itextpdf.com/sales – Bruno Lowagie Mar 23 '15 at 12:25