I would like to count the number of pages in an RTF or MS Word document using Python. Is this possible?
If you know the number of lines per page, then you could count the newlines and divide. If pagination depends on program-specific settings, as it does in MS Word, then that technique would fail. – inspectorG4dget Nov 02 '12 at 20:59
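For plain text only, the count-and-divide idea can be sketched as below. The function name `estimate_pages` and the default of 50 lines per page are assumptions made for illustration; this does not apply to RTF or Word files, where layout decides the page breaks.

```python
import math


def estimate_pages(text, lines_per_page=50):
    """Estimate the page count of plain text with a fixed number of lines per page.

    lines_per_page=50 is a hypothetical default, not a property of any format.
    """
    if lines_per_page <= 0:
        raise ValueError("lines_per_page must be positive")
    line_count = len(text.splitlines())
    # An empty document is still treated as occupying one page.
    return max(1, math.ceil(line_count / lines_per_page))
```

Long lines that wrap on the page are not accounted for, so this is at best a lower bound.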
2 Answers
4
Not without rendering the actual page.
The number of pages depends on many things, such as the font sizes in use, the margins on all four sides of the page, and any embedded objects with their own dimensions, such as images.
So you would have to render the document with a library that can lay out RTF, and let that library tell you how many pages there are.
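One way to do the rendering, as a rough sketch rather than a recommendation: let a headless LibreOffice convert the file to PDF, then count the PDF's pages. This assumes a `soffice` executable on the PATH and the third-party `pypdf` package; the function names are made up for this example. LibreOffice's layout can differ from Word's, so the count may not match what Word shows.

```python
import subprocess
import tempfile
from pathlib import Path


def build_convert_command(source, outdir, soffice="soffice"):
    """Build the LibreOffice command line that renders `source` to PDF in `outdir`."""
    return [soffice, "--headless", "--convert-to", "pdf",
            "--outdir", str(outdir), str(source)]


def count_pages_by_rendering(source, soffice="soffice"):
    """Render the document to PDF and return the number of pages in the result."""
    from pypdf import PdfReader  # third-party; imported lazily (assumed installed)

    source = Path(source)
    with tempfile.TemporaryDirectory() as outdir:
        subprocess.run(build_convert_command(source, outdir, soffice),
                       check=True, capture_output=True)
        # LibreOffice names the output after the input file's stem.
        pdf_path = Path(outdir) / (source.stem + ".pdf")
        with open(pdf_path, "rb") as fh:
            return len(PdfReader(fh).pages)
```

Usage would look like `count_pages_by_rendering("report.rtf")`, where `report.rtf` is a placeholder file name.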

Robert Harvey
+1. It's worth noting that even Word can't count the number of pages without rendering the document, which is why you get a "…" when it first loads the document (or after reformatting) while it counts the pages on the fly. – abarnert Nov 02 '12 at 21:59
0
You could fire up Word, and use OLE automation to ask it for the number of pages.
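A minimal sketch of that, assuming Windows with Word installed and the `pywin32` package. The function name and the optional `word` parameter are made up for this example; `ComputeStatistics` with the `wdStatisticPages` constant (value 2) comes from Word's object model.

```python
import os

WD_STATISTIC_PAGES = 2  # Word's WdStatistic.wdStatisticPages


def count_pages_with_word(path, word=None):
    """Ask Word, via COM automation, how many pages the document has.

    `word` may be an existing Word.Application object; if omitted, a new
    instance is started through pywin32 and shut down afterwards.
    """
    own_app = word is None
    if own_app:
        import win32com.client  # pywin32; Windows only, imported lazily
        word = win32com.client.Dispatch("Word.Application")
        word.Visible = False
    # COM needs an absolute path; relative paths resolve against Word's own cwd.
    doc = word.Documents.Open(os.path.abspath(path), ReadOnly=True)
    try:
        return doc.ComputeStatistics(WD_STATISTIC_PAGES)
    finally:
        doc.Close(False)  # close without saving
        if own_app:
            word.Quit()
```

Word opens RTF files as well, so the same call should cover both formats. The result reflects Word's layout on that machine, which can vary with the installed fonts and the default printer.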

Katriel
OK, same answer with [OpenOffice and `py-uno`](http://stackoverflow.com/questions/2256881/how-can-i-count-words-in-complex-documents-rtf-doc-odt-etc) =) – Katriel Nov 06 '12 at 08:33