
I have the coordinates of the selected text in a PDF, and I am using PDFTextStripperByArea to add that region and extract its text.

I also want the font information for the selected text, but calling the getResources() method of PDFTextStripperByArea returns null. Here is the sample code:

    PDFTextStripperByArea stripper = new PDFTextStripperByArea();
    stripper.setSortByPosition(true);
    Rectangle2D rect = new Rectangle(96, 150, 101, 11);
    stripper.addRegion("selectedText", rect);
    PDPage firstPage = document.getPage(0);
    stripper.extractRegions(firstPage);
    System.out.println(stripper.getTextForRegion("selectedText"));
    PDResources resources = stripper.getResources();
    // resources is null here, so the next line throws a NullPointerException
    for (COSName fontName : resources.getFontNames())
    {
        PDFont font = resources.getFont(fontName);

        System.out.println(font.getFontDescriptor().getFontName());
        System.out.println(font.getFontDescriptor().getFontFamily());
        System.out.println(font.getFontDescriptor().getFontWeight());
        System.out.println(font.getName());
        System.out.println(font.getSubType());
    }

Am I doing something wrong, or is there another way to achieve this?

swarupn
  • I have improved the javadoc, because that isn't what getResources of that class is for. Please try the PrintTextLocations example, and try changing it to use PDFTextStripperByArea instead of PDFTextStripper. – Tilman Hausherr May 01 '18 at 08:49
  • @swarupn Did looking at `PrintTextLocations` help? – mkl May 04 '18 at 15:18
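
For anyone following the comments: as far as I know, the PrintTextLocations example overrides writeString(String, List<TextPosition>), and each TextPosition exposes the font that drew it through getFont(). So the idea is to collect fonts per glyph rather than from the page resources. Below is a minimal sketch of that idea only, written against plain JDK classes so it runs without PDFBox. The `RegionFonts` class and its `Glyph` type are hypothetical stand-ins for TextPosition, and the coordinates and font names are made up for illustration. It also assumes the glyph coordinates and the region use the same coordinate system.

```java
import java.awt.geom.Rectangle2D;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Stand-alone model of the idea; nothing here is PDFBox API.
final class RegionFonts {

    /** Hypothetical stand-in for a TextPosition: a page position plus the name of the font used. */
    static final class Glyph {
        final double x;
        final double y;
        final String fontName;

        Glyph(double x, double y, String fontName) {
            this.x = x;
            this.y = y;
            this.fontName = fontName;
        }
    }

    /** Returns the distinct font names of all glyphs located inside the region, in first-seen order. */
    static Set<String> fontsInRegion(List<Glyph> glyphs, Rectangle2D region) {
        Set<String> fonts = new LinkedHashSet<>();
        for (Glyph g : glyphs) {
            // Keep only glyphs whose position lies within the selected rectangle.
            if (region.contains(g.x, g.y)) {
                fonts.add(g.fontName);
            }
        }
        return fonts;
    }

    public static void main(String[] args) {
        // Same rectangle as in the question.
        Rectangle2D region = new Rectangle2D.Double(96, 150, 101, 11);
        List<Glyph> glyphs = Arrays.asList(
                new Glyph(100, 155, "Helvetica"),
                new Glyph(150, 158, "Helvetica-Bold"),
                new Glyph(300, 400, "Times-Roman")); // outside the region
        System.out.println(fontsInRegion(glyphs, region)); // prints [Helvetica, Helvetica-Bold]
    }
}
```

With PDFBox itself, the equivalent would presumably be a subclass of PDFTextStripperByArea whose writeString override reads getFont() from each TextPosition it receives. I have not verified that against a real document.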

0 Answers