I know of several tools and libraries that can count the pages in a PDF, but I want to know whether this is possible by just opening the file as a text file and looking for a keyword.
3 Answers
3
Have a look at this: http://www.freevbcode.com/ShowCode.asp?ID=8153
Edit: that doesn't work; it may be too old.
I found this instead:
using System.IO;
using System.Text.RegularExpressions;

public static int GetNoOfPagesPDF(string fileName)
{
    // Dispose the stream and reader even if reading throws.
    using (FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
    using (StreamReader r = new StreamReader(fs))
    {
        string pdfText = r.ReadToEnd();
        // Count "/Type /Page" entries; [^s] skips "/Type /Pages" (page tree nodes).
        Regex regx = new Regex(@"/Type\s*/Page[^s]");
        return regx.Matches(pdfText).Count;
    }
}

pinichi
- FYI - a PDF can be written so that changes are appended to the existing file, so if you "delete" pages by appending a new catalog with fewer pages (leaving the old pages in place), this solution will produce incorrect results. – plinth Oct 11 '10 at 17:30
- The above code didn't work for me; it returned more than the correct number of pages. But it made me realize that much of a PDF is text, and I was able to find the page count with a regex (non-global match): `/Type /Pages\nCount ([0-9]+)`. – ErikE Apr 06 '13 at 00:58
1
[Edit: based on the edited question]
It is possible by reading the file as text and doing some minimal parsing.
If you read the PDF yourself, you will need to do the parsing. Each page in a PDF is represented by a page object.
The following gives a short overview of what the PDF specification says about pages, along with a link to the spec.
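As a rough illustration of that kind of minimal parsing, here is a sketch in C# that reads the /Count entry of the page tree nodes (/Type /Pages) instead of counting individual page objects. The class and method names are made up for this example. It assumes the page tree is stored uncompressed and that /Type and /Count are not separated by a nested dictionary or hex string; PDFs that use object streams (PDF 1.5 and later) or encryption can hide these dictionaries, in which case it returns -1. It also takes the largest /Count it sees, so a file with appended revisions may still report an old total.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class PdfPageCountSketch
{
    // A page tree node may list /Type /Pages before /Count, or the reverse.
    // [^>]*? stops at the first '>', which keeps a match inside one dictionary.
    private static readonly Regex TypeThenCount =
        new Regex(@"/Type\s*/Pages\b[^>]*?/Count\s+(\d+)");
    private static readonly Regex CountThenType =
        new Regex(@"/Count\s+(\d+)[^>]*?/Type\s*/Pages\b");

    // Returns the largest /Count found on a /Type /Pages node, or -1 if none is visible.
    public static int CountFromPageTree(string pdfText)
    {
        int best = -1;
        foreach (Regex regex in new[] { TypeThenCount, CountThenType })
        {
            foreach (Match m in regex.Matches(pdfText))
            {
                int count;
                if (int.TryParse(m.Groups[1].Value, out count) && count > best)
                    best = count;
            }
        }
        return best;
    }

    public static int CountFromFile(string fileName)
    {
        // ISO-8859-1 maps every byte to exactly one char, so binary data
        // in the file cannot cause decoding errors.
        string pdfText = File.ReadAllText(fileName, Encoding.GetEncoding("ISO-8859-1"));
        return CountFromPageTree(pdfText);
    }
}
```

Taking the maximum is a heuristic: intermediate nodes in the page tree carry the count of their own subtree only, so the root node normally has the largest value.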

pyfunc
- Preferring pinichi's answer as it has working code. Voting up your answer because it's very helpful. – Chry Cheng Oct 05 '10 at 07:19
-1
The xpdf utilities package (called xpdf-utils in Debian) includes an application called pdfinfo. It prints the number of pages in the file, among other data.
http://www.linuxquestions.org/questions/programming-9/how-to-find-pdf-page-count-699113/
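If you do end up calling pdfinfo from code, something like the following C# sketch could work. It assumes pdfinfo is installed and on the PATH, and it relies on the `Pages:` line that pdfinfo prints. The class and method names are invented for this example, and the argument quoting is deliberately simple (it does not handle file names that themselves contain double quotes).

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the pdfinfo command-line tool.
public static class PdfInfoSketch
{
    // Extracts N from the "Pages:   N" line of pdfinfo's output; returns -1 if absent.
    public static int ParsePages(string pdfinfoOutput)
    {
        Match m = Regex.Match(pdfinfoOutput, @"^Pages:\s+(\d+)", RegexOptions.Multiline);
        return m.Success ? int.Parse(m.Groups[1].Value) : -1;
    }

    public static int CountWithPdfInfo(string fileName)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = "pdfinfo",                // assumed to be on the PATH
            Arguments = "\"" + fileName + "\"",  // naive quoting, see note above
            RedirectStandardOutput = true,
            UseShellExecute = false
        };
        using (Process process = Process.Start(startInfo))
        {
            // Read all output before waiting, so a full pipe cannot block the child.
            string output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return ParsePages(output);
        }
    }
}
```

Keeping the parsing in its own method means it can be checked against captured pdfinfo output without running the tool.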

Gadolin
- Sorry, not what I'm looking for. Edited my question's description to clarify further. – Chry Cheng Oct 05 '10 at 06:52