3

My code:

        FileInputStream pdfFile = new FileInputStream("C:/work/pdf2tiff/test.PDF");
        PDDocument pdDocument = PDDocument.load(pdfFile, true);

        PDDocumentCatalog catalog = pdDocument.getDocumentCatalog();
        List pages = catalog.getAllPages();

        if (pages != null && pages.size() > 0) {

            for (int i = 0; i < pages.size(); i++) {
                PDPage page = (PDPage) pages.get(i);
                Map fonts = page.getResources().getFonts();
                System.out.println("fonts=" + fonts);

I got output:

fonts={F0=org.apache.pdfbox.pdmodel.font.PDType1Font@8aaed5,
F4=org.apache.pdfbox.pdmodel.font.PDType0Font@dc4414, F2=org.apache.pdfbox.pdmodel.font.PDType0Font@f98ce0, F6=org.apache.pdfbox.pdmodel.font.PDTrueTypeFont@18fcdce}

Why the fonts map's key is F0/F1/F2/F6? What these mean? Should I iterate all pdf pages get all fonts?

Thanks for your answer.

maxeric
  • 71
  • 4

1 Answers1

0

It seems like the pdf you loaded has multiple fonts loaded. I couldn't figure out any way to retrieve fonts from a document (which I think should be available for us to retrieve since we load fonts to a particular document).

I'm guessing when you load a font into the document it uses "F0", "F1", etc as keys to map to PDFont type. When you print fonts object, it's printing the memory location of the object.

To get all the embedded fonts, you can create a new HashMap() object, then iterate over all the pages and add each font to your HashMap(). Then you can iterate over the keys, get the PDFont font object and, use font.getSubType() to get some sort of description of the font.

Hope this helps. Good luck!

Jay Parekh
  • 41
  • 8