I am having trouble getting consistent results from the iText parser. This is the code:
    public void parsePdf(String pdf) throws IOException {
        PdfReader reader = new PdfReader(pdf);
        // Region to extract, in PDF user-space points: (llx, lly, urx, ury)
        Rectangle rect = new Rectangle(370, 280, 380, 613);
        RenderFilter filter = new RegionTextRenderFilter(rect);
        TextExtractionStrategy strategy =
            new FilteredTextRenderListener(new LocationTextExtractionStrategy(), filter);
        // Extract only the text on page 1 that falls inside the region
        String s = PdfTextExtractor.getTextFromPage(reader, 1, strategy);
        reader.close();
        System.out.println(s);
    }
I am creating the PDFs with Report Manager. The templates for the two types of files are different, but the positioning of the fields that I want to extract is the same.
I am using LocationTextExtractionStrategy. The rectangle covers the position that I want to parse. When printed on paper, the field in question is in the same position in both documents, so my guess is that it should parse the same, but that is not the case. The first document gives me the expected result, but when I parse the second with the same rectangle coordinates, I get text from two lines above the expected place. I hope this is a better explanation.
I set up the templates in Report Manager so that the target field is at the same position, with the same font size, spacing, and document header in both PDFs, as is evident when they are printed out. When parsed, however, the second PDF gives me a two-line offset.
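One thing I have not ruled out is that the two templates produce pages with a different height or origin. PDF coordinates are measured from the bottom-left corner of the page, so a field at the same distance from the top edge would end up at a different y value if the page boxes differ. The sketch below is plain Java, not iText code. The helper `topOffsetToPdfY`, the 229 pt offset, and the two page heights (A4 and US Letter) are made-up values for illustration only, and I have not confirmed that my templates actually differ this way. In iText the real values could presumably be read with `reader.getPageSize(1)`.

```java
public class RegionCheck {

    /**
     * Hypothetical helper: converts a distance measured down from the top
     * edge of the page into a PDF user-space y coordinate, which is
     * measured up from the bottom of the page box.
     */
    static float topOffsetToPdfY(float pageBottom, float pageHeight, float offsetFromTop) {
        return pageBottom + pageHeight - offsetFromTop;
    }

    public static void main(String[] args) {
        // Assumed value: the field sits 229 pt below the top edge on paper.
        float offsetFromTop = 229f;

        // Assumed page boxes: A4 is 842 pt tall, US Letter is 792 pt tall.
        float yOnA4 = topOffsetToPdfY(0f, 842f, offsetFromTop);
        float yOnLetter = topOffsetToPdfY(0f, 792f, offsetFromTop);

        System.out.println("y on A4 page:     " + yOnA4);
        System.out.println("y on Letter page: " + yOnLetter);
        System.out.println("difference:       " + (yOnA4 - yOnLetter));
    }
}
```

If the page boxes of my two files turn out to be identical, this is not the cause and the offset must come from somewhere else.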