I am trying to underline justified text in a PDF with itextpdf. I think I have uncovered a bug, and I'd really like a workaround.

When I call getBaseline() as described on the mailing lists, the underline extends far past the end of the text into the next column.

        float lx = renderInfos.get(i).getBaseline().getStartPoint().get(0);
        float rx = renderInfos.get(i).getBaseline().getEndPoint().get(0);

You can download the original pdf from the publisher's website

thanks!

I have seen this on all versions of itextpdf I have tried, from 4.1.0 to the most recent 5.5.0.

It would take some effort to separate the underlining code from other proprietary code that I cannot share. If you think it would help, I can do that.

If this is a bug, is there an issue tracker I can log it with?

PS (mkl): Here is a short code fragment that reproduces the issue:

PdfReader reader = new PdfReader(...);

PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(...));

for (int page = 1; page <= reader.getNumberOfPages(); page++)
{
    final List<TextRenderInfo> infos = new ArrayList<TextRenderInfo>();
    PdfTextExtractor.getTextFromPage(reader, page, new TextExtractionStrategy()
    {
        public void renderText(TextRenderInfo renderInfo)
        {
            infos.add(renderInfo);
        }

        public void renderImage(ImageRenderInfo renderInfo) { }
        public void endTextBlock() { }
        public void beginTextBlock() { }
        public String getResultantText() { return "";}
    });

    PdfContentByte content = stamper.getOverContent(page);
    for (TextRenderInfo info : infos)
    {
        float lx = info.getBaseline().getStartPoint().get(0);
        float rx = info.getBaseline().getEndPoint().get(0);
        float y = info.getBaseline().getEndPoint().get(1);
        content.moveTo(lx, y);
        content.lineTo(rx, y);
        content.stroke();
    }
}

stamper.close();
mkl
user833970

1 Answer

The underlying cause of this issue is that the OP collects the TextRenderInfo objects received in renderText in a list, renderInfos, and queries them only afterwards. (In the sample code I added to the question to reproduce the issue, I did likewise, using a list named infos.)

TextRenderInfo objects neither store a copy of the whole graphics state as it was when they were created, nor do they pre-calculate all the properties that can be queried later. Instead, each property is calculated when it is requested, using the information that is current at the time of the request.

For example, when a TextRenderInfo instance's getBaseline() method is called, the base line is calculated using the graphics state of the parser at the time of the getBaseline() call. In the case of the code reproducing the issue, this means that the base lines are calculated using the graphics state settings in effect at the end of the page's content stream. This notably includes graphics state properties such as character spacing and word spacing, which affect the length of the base line.

To fix the OP's code, therefore, all information required from the TextRenderInfo instances must be calculated during the renderText call.
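The effect can be reproduced without iText in a small toy model. This is only a sketch under simplifying assumptions: GraphicsState and RenderInfo below are hypothetical stand-ins written for this illustration, not the iText classes, and the width formula is deliberately reduced to glyph width plus word spacing per space. It shows why querying a lazily evaluated object after the shared state has moved on gives a different result than querying it immediately.

```java
import java.util.ArrayList;
import java.util.List;

class LazyBaselineDemo {

    // Hypothetical stand-in for the parser's mutable graphics state.
    static class GraphicsState {
        float wordSpacing;
    }

    // Hypothetical stand-in for a render-info object: it keeps a reference
    // to the shared state and computes its width only when asked.
    static class RenderInfo {
        private final GraphicsState gs;
        private final float glyphWidth;
        private final int spaces;

        RenderInfo(GraphicsState gs, float glyphWidth, int spaces) {
            this.gs = gs;
            this.glyphWidth = glyphWidth;
            this.spaces = spaces;
        }

        float getWidth() {
            // Reads the state as it is now, not as it was at construction.
            return glyphWidth + spaces * gs.wordSpacing;
        }
    }

    // Collects the objects and queries them after "parsing" has finished.
    static List<Float> lateWidths() {
        GraphicsState gs = new GraphicsState();
        List<RenderInfo> infos = new ArrayList<>();

        gs.wordSpacing = 0f;                     // first chunk: no extra spacing
        infos.add(new RenderInfo(gs, 100f, 3));
        gs.wordSpacing = 5f;                     // second chunk: wider spacing
        infos.add(new RenderInfo(gs, 80f, 2));

        List<Float> widths = new ArrayList<>();
        for (RenderInfo info : infos) {
            widths.add(info.getWidth());         // both use the final state
        }
        return widths;
    }

    // Queries each object immediately, while its state is still current.
    static List<Float> eagerWidths() {
        GraphicsState gs = new GraphicsState();
        List<Float> widths = new ArrayList<>();

        gs.wordSpacing = 0f;
        widths.add(new RenderInfo(gs, 100f, 3).getWidth());
        gs.wordSpacing = 5f;
        widths.add(new RenderInfo(gs, 80f, 2).getWidth());
        return widths;
    }

    public static void main(String[] args) {
        System.out.println("late:  " + lateWidths());   // late:  [115.0, 90.0]
        System.out.println("eager: " + eagerWidths());  // eager: [100.0, 90.0]
    }
}
```

In the late variant the first chunk comes out 15 units too wide, because its width is computed with the word spacing that was set for the second chunk. That corresponds to the over-long underline in the question.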

For example, the code I added to the question to reproduce the issue could be changed like this:

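// Imports needed by this fragment (iText 5.x):
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfStamper;
import com.itextpdf.text.pdf.parser.ImageRenderInfo;
import com.itextpdf.text.pdf.parser.LineSegment;
import com.itextpdf.text.pdf.parser.PdfTextExtractor;
import com.itextpdf.text.pdf.parser.TextExtractionStrategy;
import com.itextpdf.text.pdf.parser.TextRenderInfo;
import java.io.FileOutputStream;
import java.util.ArrayList;
import java.util.List;

PdfReader reader = new PdfReader(...);

PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(...));

for (int page = 1; page <= reader.getNumberOfPages(); page++)
{
    // Store the calculated base lines, not the TextRenderInfo objects.
    final List<LineSegment> lines = new ArrayList<LineSegment>();
    PdfTextExtractor.getTextFromPage(reader, page, new TextExtractionStrategy()
    {
        public void renderText(TextRenderInfo renderInfo)
        {
            // Query the base line now, while the graphics state is current.
            lines.add(renderInfo.getBaseline());
        }

        public void renderImage(ImageRenderInfo renderInfo) { }
        public void endTextBlock() { }
        public void beginTextBlock() { }
        public String getResultantText() { return "";}
    });

    // Draw one line along each stored base line on top of the page content.
    PdfContentByte content = stamper.getOverContent(page);
    for (LineSegment line : lines)
    {
        float lx = line.getStartPoint().get(0);
        float rx = line.getEndPoint().get(0);
        float y = line.getEndPoint().get(1);
        content.moveTo(lx, y);
        content.lineTo(rx, y);
        content.stroke();
    }
}

stamper.close();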

Now the base line is calculated during the renderText call and is therefore correct.

PS: @Bruno Probably JavaDoc warnings to that effect should be attached to the renderText method and the TextRenderInfo class.

mkl