I need to read file details, especially Authors, Title, and Subject, from the newer Office files (.docx, .xlsx). I found this MSDN article, which provides some methods for this: http://msdn.microsoft.com/en-us/library/bb739835%28v=office.12%29.aspx But I can't seem to make it work. The method I'm using is:

public static string WDRetrieveCoreProperty(string docName, string propertyName)
{
   // Given a document name and a core property, retrieve the value of the property.
   // Note that because this code uses the SelectSingleNode method, 
   // the search is case sensitive. That is, looking for "Author" is not 
   // the same as looking for "author".

   const string corePropertiesSchema = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties";
   const string dcPropertiesSchema = "http://purl.org/dc/elements/1.1/";
   const string dcTermsPropertiesSchema = "http://purl.org/dc/terms/";

   string propertyValue = string.Empty;

   // Open read-only (isEditable: false); this method only reads properties.
   using (WordprocessingDocument wdPackage = WordprocessingDocument.Open(docName, false))
   {
      // Get the core properties part (core.xml).
      CoreFilePropertiesPart corePropertiesPart = wdPackage.CoreFilePropertiesPart;

      // Manage namespaces to perform XML XPath queries.
      NameTable nt = new NameTable();
      XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
      nsManager.AddNamespace("cp", corePropertiesSchema);
      nsManager.AddNamespace("dc", dcPropertiesSchema);
      nsManager.AddNamespace("dcterms", dcTermsPropertiesSchema);

      // Get the properties from the package.
      XmlDocument xdoc = new XmlDocument(nt);

      // Load the XML in the part into an XmlDocument instance.
      xdoc.Load(corePropertiesPart.GetStream());

      string searchString = string.Format("//cp:coreProperties/{0}", propertyName);

      XmlNode xNode = xdoc.SelectSingleNode(searchString, nsManager);
      if (xNode != null)
      {
         propertyValue = xNode.InnerText;
      }
   }

   return propertyValue;
}

So I'm calling this method like:

WDRetrieveCoreProperty(textBox1.Text, "Authors"); 
// textBox1 has path to some .docx file

But it always returns null. So what is wrong with this?

Toto
andree

2 Answers

I know this question is old, but I ran across it while researching the same issue. The MSDN example includes the code for the method that retrieves core properties, but it does not show how to call it.

The property name you pass in must include its namespace prefix, because the method inserts the name directly into the XPath query. So accessing the lastModifiedBy core property using the OP's method would look like:

WDRetrieveCoreProperty(textBox1.Text, "cp:lastModifiedBy");
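
To make the prefix rule concrete, here is a minimal sketch that runs the same kind of XPath lookup against an inline XML string instead of a real document, so it needs only System.Xml. The class name `CorePropertyLookup`, the `QualifiedNames` table, and the sample XML values are hypothetical and only for illustration. It assumes the usual layout of core.xml, where the author is stored as `dc:creator`, and the title and subject as `dc:title` and `dc:subject`.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class CorePropertyLookup
{
    const string CpNs = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties";
    const string DcNs = "http://purl.org/dc/elements/1.1/";

    // Hypothetical lookup table: friendly name -> qualified element name in core.xml.
    public static readonly IReadOnlyDictionary<string, string> QualifiedNames =
        new Dictionary<string, string>
        {
            { "Author", "dc:creator" },
            { "Title", "dc:title" },
            { "Subject", "dc:subject" },
            { "LastModifiedBy", "cp:lastModifiedBy" },
        };

    // Made-up sample standing in for the contents of docProps/core.xml.
    public const string SampleCoreXml =
        "<cp:coreProperties xmlns:cp=\"" + CpNs + "\" xmlns:dc=\"" + DcNs + "\">" +
        "<dc:title>Quarterly Report</dc:title>" +
        "<dc:subject>Sales</dc:subject>" +
        "<dc:creator>Jane Doe</dc:creator>" +
        "<cp:lastModifiedBy>John Roe</cp:lastModifiedBy>" +
        "</cp:coreProperties>";

    // Returns the text of the named property, or an empty string if no element matches.
    public static string Read(string coreXml, string qualifiedName)
    {
        var xdoc = new XmlDocument();
        xdoc.LoadXml(coreXml);

        // The prefixes registered here are the ones the caller must use in qualifiedName.
        var ns = new XmlNamespaceManager(xdoc.NameTable);
        ns.AddNamespace("cp", CpNs);
        ns.AddNamespace("dc", DcNs);

        XmlNode node = xdoc.SelectSingleNode("/cp:coreProperties/" + qualifiedName, ns);
        return node != null ? node.InnerText : string.Empty;
    }
}
```

With this sketch, `CorePropertyLookup.Read(CorePropertyLookup.SampleCoreXml, "dc:creator")` finds the author element, while an unprefixed name such as `"Authors"` matches nothing and yields an empty string, which is the behaviour described in the question.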

I did this...

using System;
using System.IO;
using System.IO.Packaging; // Assembly WindowsBase.dll
  :
     static void Main(string[] args)
     {
        String path = Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData);
        String file = Path.Combine(path, "Doc1.docx");

        // The using block closes the package even if reading a property throws.
        using (Package docx = Package.Open(file, FileMode.Open, FileAccess.Read))
        {
           String subject = docx.PackageProperties.Subject;
           String title = docx.PackageProperties.Title;
           String author = docx.PackageProperties.Creator; // the author is exposed as Creator
        }
     }
Black Light