7

I have about 400 files in .docx format, and I need to determine the number of pages in each.

So, I want to write C# code that lets me select the folder containing the documents and then returns the number of pages in each .docx file.

Nasreddine
AyaZoghby

4 Answers

20

To illustrate how this can be done, I have just created a C# console application based on .NET 4.5 and some of the Microsoft Office 2013 COM objects.

using System;
using Microsoft.Office.Interop.Word;

namespace WordDocStats
{
    class Program
    {
        // Based on: http://www.dotnetperls.com/word
        static void Main(string[] args)
        {
            // Open a doc file.
            var application = new Application();
            var document = application.Documents.Open(@"C:\Users\MyName\Documents\word.docx");

            // Get the page count.
            var numberOfPages = document.ComputeStatistics(WdStatistic.wdStatisticPages, false);

            // Print out the result.
            Console.WriteLine("Total number of pages in document: {0}", numberOfPages);

            // Quit Word without saving, so no save prompt can block the hidden instance.
            application.Quit(false);
        }
    }
}

For this to work you need to reference the following COM objects:

  • Microsoft Office Object Library (version 15.0 in my case)
  • Microsoft Word Object Library (version 15.0 in my case)

These two COM objects give you access to the namespaces needed.

For details on how to reference the correct assemblies, please refer to section: "3. Setting Up Work Environment:" at: http://www.c-sharpcorner.com/UploadFile/amrish_deep/WordAutomation05102007223934PM/WordAutomation.aspx

For a quick and more general introduction to Word automation through C#, see: http://www.dotnetperls.com/word
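The question asks about a whole folder rather than a single file, so here is a minimal sketch of how the console application above could be extended. It assumes the folder path is passed as the first command-line argument; the class name `FolderPageCounter` and the helper `IsCountableDocx` are hypothetical names introduced for illustration. A single Word instance is reused for all files, since starting Word once per document would be slow for 400 files.

```csharp
using System;
using System.IO;
using Microsoft.Office.Interop.Word;

static class FolderPageCounter
{
    // Hypothetical helper: true for .docx files that are not Word's "~$" lock files.
    public static bool IsCountableDocx(string path)
    {
        var name = Path.GetFileName(path);
        return name.EndsWith(".docx", StringComparison.OrdinalIgnoreCase)
            && !name.StartsWith("~$", StringComparison.Ordinal);
    }

    static void Main(string[] args)
    {
        // Assumption: the folder comes from the command line, else the current directory.
        var folder = args.Length > 0 ? args[0] : Directory.GetCurrentDirectory();

        var application = new Application();
        try
        {
            // Only the top-level folder is scanned, not subfolders.
            foreach (var path in Directory.EnumerateFiles(folder, "*.docx"))
            {
                if (!IsCountableDocx(path)) continue;

                // Word expects a full path; open read-only so nothing is modified.
                var document = application.Documents.Open(Path.GetFullPath(path), ReadOnly: true);
                var pages = document.ComputeStatistics(WdStatistic.wdStatisticPages, false);
                Console.WriteLine("{0}: {1}", Path.GetFileName(path), pages);

                // Close each document without saving before opening the next one.
                ((_Document)document).Close(false);
            }
        }
        finally
        {
            // Make sure the Word process exits even if a document fails to open.
            ((_Application)application).Quit(false);
        }
    }
}
```

The casts to `_Document` and `_Application` are there only to avoid the compiler warning about `Close` and `Quit` being both a method and an event on the interop types.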

-- UPDATE

Documentation about the method Document.ComputeStatistics that gives you access to the page count can be found here: http://msdn.microsoft.com/en-us/library/microsoft.office.tools.word.document.computestatistics.aspx

As seen in the documentation, the method takes a WdStatistic enum value that lets you retrieve different kinds of statistics, e.g., the total number of pages. For an overview of the complete range of statistics available, please refer to the documentation of the WdStatistic enum, which can be found here: http://msdn.microsoft.com/en-us/library/microsoft.office.interop.word.wdstatistic.aspx
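As a rough illustration of the other statistics, the sketch below prints a few WdStatistic values for a document that is already open. The class name `DocumentStats` and the choice of three statistics are assumptions made for the example; the enum members themselves come from the Word interop library.

```csharp
using System;
using Microsoft.Office.Interop.Word;

static class DocumentStats
{
    // An arbitrary selection of statistics, chosen for illustration.
    public static readonly WdStatistic[] Kinds =
    {
        WdStatistic.wdStatisticPages,
        WdStatistic.wdStatisticWords,
        WdStatistic.wdStatisticCharacters
    };

    // Prints each selected statistic for a document that is already open.
    public static void Print(Document document)
    {
        foreach (var kind in Kinds)
        {
            // The second argument says whether footnotes and endnotes are included.
            var value = document.ComputeStatistics(kind, false);
            Console.WriteLine("{0}: {1}", kind, value);
        }
    }
}
```

Calling `DocumentStats.Print(document)` right after the document is opened in the program above would list all three values for that file.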

Lasse Christiansen
4

Use DocumentFormat.OpenXml.dll. You can find the DLL in C:\Program Files\Open XML SDK\V2.0\lib

Sample code:

// Open the document read-only and dispose it when done.
using (DocumentFormat.OpenXml.Packaging.WordprocessingDocument doc = DocumentFormat.OpenXml.Packaging.WordprocessingDocument.Open(docxPath, false))
{
    MessageBox.Show(doc.ExtendedFilePropertiesPart.Properties.Pages.InnerText);
}

To use the DocumentFormat.OpenXml.Packaging.WordprocessingDocument class, you need to add the following references to your project:

DocumentFormat.OpenXml.dll and WindowsBase.dll

Jignesh Thakker
1

You can use Spire.Doc; its page count feature is free :)

using System.IO;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Spire.Doc;

[TestClass]
public sealed class TestNcWorker
{
    [TestMethod]
    public void DocTemplate3851PageCount()
    {
        // Resource.DocTemplate3851 holds the .docx content as a byte[] (not defined in this snippet).
        var docTemplate3851 = Resource.DocTemplate3851;

        // Wrapping the bytes directly leaves the stream positioned at 0, ready to be read.
        using (var ms = new MemoryStream(docTemplate3851))
        {
            Document document = new Document();
            document.LoadFromStream(ms, FileFormat.Docx);
            Assert.AreEqual(2, document.PageCount);
        }

        // BarcodeAttacher is not defined in this snippet either; its Value is a second .docx as a byte[].
        var barCoder = new BarcodeAttacher("8429053", "1319123", "HR3514");
        var barcoded = barCoder.AttachBarcode(docTemplate3851).Value;
        using (var ms = new MemoryStream(barcoded))
        {
            Document document = new Document();
            document.LoadFromStream(ms, FileFormat.Docx);
            Assert.AreEqual(3, document.PageCount);
        }
    }
}
0

Modern solution (based on Jignesh Thakker's answer): the Open XML SDK is no longer available at that location, but it is published on GitHub and even supports .NET Core. You do not need MS Office on the server or the machine running the code.

Install the NuGet package:

Install-Package DocumentFormat.OpenXml

The code:

using DocumentFormat.OpenXml.Packaging;

private int CountWordPage(string filePath)
{
    using (var wordDocument = WordprocessingDocument.Open(filePath, false))
    {
        return int.Parse(wordDocument.ExtendedFilePropertiesPart.Properties.Pages.Text);
    }
}
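To apply this to a whole folder, as the question asks, something along these lines might work. It is a sketch under a few assumptions: the folder path arrives as the first command-line argument, and `DocxPageReport`, `ParsePages`, and `ReadStoredPageCount` are hypothetical names introduced here. Note that the `Pages` value is metadata stored by the application that last saved the file, so it can be missing or out of date for documents produced by other tools; the sketch returns null in that case rather than throwing.

```csharp
using System;
using System.IO;
using DocumentFormat.OpenXml.Packaging;

static class DocxPageReport
{
    // Parses the stored page count; returns null when it is missing or not a number.
    public static int? ParsePages(string text)
    {
        int pages;
        return int.TryParse(text, out pages) ? pages : (int?)null;
    }

    // Reads the page count that was stored in the document's extended properties.
    public static int? ReadStoredPageCount(string filePath)
    {
        using (var doc = WordprocessingDocument.Open(filePath, false))
        {
            // Any of these parts may be absent, hence the null-conditional chain.
            var text = doc.ExtendedFilePropertiesPart?.Properties?.Pages?.Text;
            return ParsePages(text);
        }
    }

    static void Main(string[] args)
    {
        // Assumption: the folder comes from the command line, else the current directory.
        var folder = args.Length > 0 ? args[0] : Directory.GetCurrentDirectory();

        foreach (var path in Directory.EnumerateFiles(folder, "*.docx"))
        {
            var name = Path.GetFileName(path);

            // Skip the "~$" lock files Word creates next to open documents.
            if (name.StartsWith("~$", StringComparison.Ordinal)) continue;

            var pages = ReadStoredPageCount(path);
            Console.WriteLine("{0}: {1}", name,
                pages.HasValue ? pages.Value.ToString() : "not stored");
        }
    }
}
```

If an exact count is required for files where the stored value is missing or stale, the document has to be laid out by something that actually paginates it, such as Word itself.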
Luke Vo