I want to parse a PDF document that I download and read with ABCpdf, but I can't find any elements in the document, or any way to reach and iterate over them. I want to extract some text from it.

var webClient = new WebClient();
var bytes = webClient.DownloadData("http://test.com/test.pdf");

var doc = new Doc();
doc.Read(bytes);
Mike Flynn

1 Answer

Use the Doc.GetText method to extract content from the current page, specifying the format in which content is to be returned.

doc.PageNumber = 1;
string pageContent = doc.GetText("Text");

The example above will return plain text in layout order. Specifying "SVG" or "SVG+" returns additional information along with the text, such as style and position.
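
Since GetText works on the current page only, extracting text from the whole document means stepping through the pages. The following is a minimal sketch of that loop, combining the question's download code with the call above. It assumes the `WebSupergoo.ABCpdf11` namespace (the version suffix depends on the ABCpdf release you have installed), and the class name `PdfTextDump` is only illustrative.

```csharp
using System;
using System.Net;
using System.Text;
using WebSupergoo.ABCpdf11; // assumption: adjust the suffix to your installed ABCpdf version

class PdfTextDump // hypothetical name, for illustration only
{
    static void Main()
    {
        // Both WebClient and Doc hold resources, so dispose of them when done.
        using (var webClient = new WebClient())
        using (var doc = new Doc())
        {
            byte[] bytes = webClient.DownloadData("http://test.com/test.pdf");
            doc.Read(bytes);

            var allText = new StringBuilder();

            // ABCpdf page numbers are 1-based.
            for (int page = 1; page <= doc.PageCount; page++)
            {
                doc.PageNumber = page;                    // select the page to read
                allText.AppendLine(doc.GetText("Text"));  // plain text of that page
            }

            Console.WriteLine(allText.ToString());
        }
    }
}
```

From here you can search or split the collected string as needed. If you need to know where each piece of text sits on the page, pass "SVG" or "SVG+" instead of "Text" and parse the returned markup.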

AffineMesh