My program reads through a PDF and extracts the text. When it reaches a blank page, I get the error "System.InvalidOperationException: Unable to handle Content of type iTextSharp.text.pdf.PdfDictionary", and the program stops.

How do I check to see if the page is blank before trying to read it? How do I continue in my program if it does hit a blank page?

Code:

for (int i = 1; i <= reader.NumberOfPages; i++)
     output.WriteLine(PdfTextExtractor.GetTextFromPage(reader, i, new SimpleTextExtractionStrategy()));
Kurt Pfeifle
boilers222

1 Answer

Something like this? It writes a page's text only when the extraction returns a non-empty string:

for (int i = 1; i <= reader.NumberOfPages; i++)
{
    string tmp = PdfTextExtractor.GetTextFromPage(reader, i,
                     new SimpleTextExtractionStrategy());
    // Skip pages that produced no text
    if (!string.IsNullOrEmpty(tmp))
        output.WriteLine(tmp);
}
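
Note that the empty-string check only runs after GetTextFromPage has returned, so if the exception from the question is thrown inside that call, the loop still stops. A minimal sketch of one way to keep going, assuming the blank pages are the ones that raise InvalidOperationException as in the error message above: wrap the call in a try/catch and skip the page. The class name PdfTextDump and the method WriteAllPages are made up for this illustration; reader and output are the same objects as in the question.

```csharp
using System;
using System.IO;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

static class PdfTextDump   // hypothetical helper, not part of iTextSharp
{
    // Writes the text of every readable page to output and
    // returns how many pages were skipped because extraction threw.
    public static int WriteAllPages(PdfReader reader, TextWriter output)
    {
        int skipped = 0;
        for (int i = 1; i <= reader.NumberOfPages; i++)
        {
            string text;
            try
            {
                text = PdfTextExtractor.GetTextFromPage(
                    reader, i, new SimpleTextExtractionStrategy());
            }
            catch (InvalidOperationException)
            {
                // Assumption: this is the exception the blank page triggers.
                skipped++;
                continue;   // move on to the next page
            }

            // Also skip pages that extract cleanly but contain no text
            if (!string.IsNullOrEmpty(text))
                output.WriteLine(text);
        }
        return skipped;
    }
}
```

Catching only InvalidOperationException keeps other failures (a corrupt file, for instance) visible instead of silently swallowing them. If you want to know which pages were skipped, log i inside the catch block.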
JP Hellemons