
I have this docx file loaded in my code:

byte[] documentBytes = File.ReadAllBytes("C:\\mydocument.docx");

This document contains the word "foo" in the main body, a header, or a footer. What is the easiest way to check for the existence of the word "foo"?

yesman
  • I posted this with my own answer right below, which is why this is a pretty sparse question. – yesman Feb 10 '21 at 12:42
  • While SO highly encourages users to answer their own questions, this doesn't mean the question may lack relevant information. The rules for posting a question are exactly the same as if you had not answered it yourself, so please put **all** relevant information into your question. – MakePeaceGreatAgain Feb 10 '21 at 12:52
  • I am always up for improving my own questions. Can you give me a tip on how to do that here? Posting code seems a bit useless in this case. – yesman Feb 10 '21 at 12:53
  • Actually, you **can** provide your attempts and how they failed. It's a **question**, remember? So it does not need to contain the **answer**. If I remember right, your question yesterday already showed some effort. – MakePeaceGreatAgain Feb 10 '21 at 12:54
  • There was no code in my previous question either. But I will add some extra code to this question so it looks fancy. – yesman Feb 10 '21 at 12:59
  • It's not about looking fancy, but about being of use to others. – MakePeaceGreatAgain Feb 10 '21 at 13:00

1 Answer

Using Open-Xml-PowerTools:

using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using DocumentFormat.OpenXml.Packaging;
using OpenXmlPowerTools;

...

byte[] documentBytes = GetMyBytes(); // Load the docx file with File.ReadAllBytes, generate a byte array, etc.
using var myStream = new MemoryStream(documentBytes, false);
using var myDocument = WordprocessingDocument.Open(myStream, false); // myStream can also be replaced with a file path string

var regex = new Regex("foo");

// Each call returns the number of regex matches found in the given paragraphs (W.p elements).
int headerCount = OpenXmlRegex.Match(myDocument.MainDocumentPart.HeaderParts.SelectMany(x => x.GetXDocument().Descendants(W.p)), regex);
int footerCount = OpenXmlRegex.Match(myDocument.MainDocumentPart.FooterParts.SelectMany(x => x.GetXDocument().Descendants(W.p)), regex);
int bodyCount = OpenXmlRegex.Match(myDocument.MainDocumentPart.GetXDocument().Descendants(W.p), regex);

The variables headerCount, footerCount and bodyCount hold the number of regex matches in the headers, the footers and the main body respectively. MainDocumentPart also exposes properties for images, charts, themes, etc.
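Since the question asks only whether the word exists, the three counts can be folded into a single yes/no answer. The following is a minimal sketch, assuming the Open-Xml-PowerTools and Open XML SDK packages are referenced. The class DocxSearch and the method ContainsWord are hypothetical names used for illustration and are not part of either library. Regex.Escape is used on the assumption that the search term should be matched literally.

```csharp
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using DocumentFormat.OpenXml.Packaging;
using OpenXmlPowerTools;

// Hypothetical helper class; not part of Open-Xml-PowerTools.
public static class DocxSearch
{
    // Returns true if the word occurs in the main body, any header, or any footer.
    public static bool ContainsWord(byte[] documentBytes, string word)
    {
        using var stream = new MemoryStream(documentBytes, false);
        using var document = WordprocessingDocument.Open(stream, false);

        // Escape the word so regex metacharacters in it are matched literally.
        var regex = new Regex(Regex.Escape(word));
        var main = document.MainDocumentPart;

        // Same three calls as above: one per area of the document.
        int bodyCount = OpenXmlRegex.Match(
            main.GetXDocument().Descendants(W.p), regex);
        int headerCount = OpenXmlRegex.Match(
            main.HeaderParts.SelectMany(x => x.GetXDocument().Descendants(W.p)), regex);
        int footerCount = OpenXmlRegex.Match(
            main.FooterParts.SelectMany(x => x.GetXDocument().Descendants(W.p)), regex);

        return bodyCount + headerCount + footerCount > 0;
    }
}
```

With the byte array from the question, the check then becomes `bool hasFoo = DocxSearch.ContainsWord(documentBytes, "foo");`. Keeping the three counts separate inside the helper makes it easy to report where the word was found if that is needed later.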

yesman
  • Perhaps you want to update the repository link, since the repo in the link is no longer maintained. Please refer to this one instead: https://github.com/EricWhiteDev/Open-Xml-PowerTools – Benzara Tahar Feb 10 '21 at 12:32