Environment

Visual Studio 2017 C# (Word .docx file)

Problem

The find/replace only replaces "{Today}" - it fails to replace the "{ConsultantName}" field. I've checked the document and tried using different approaches (see commented-out code) but no joy.

The Word document has just a few paragraphs of text - there are no tables or text boxes in the document. What am I doing wrong?

Update

When I inspect the doc_text string, I can see "{Today}" intact, but "{ConsultantName}" is split across multiple runs. The opening and closing braces are separated from the name by XML tags:

    {</w:t></w:r><w:proofErr w:type="spellStart"/><w:r w:rsidR="00544806"><w:t>ConsultantName</w:t></w:r><w:proofErr w:type="spellEnd"/><w:r w:rsidR="00544806"><w:t>}

Code

    string doc_text = string.Empty;
    List<string> s_find = new List<string>();
    List<string> s_replace = new List<string>();
    // Regex regexText = null;

    s_find.Add("{Today}");
    s_replace.Add("24 Sep 2018");
    s_find.Add("{ConsultantName}");
    s_replace.Add("John Doe");

    using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(filePath, true))
    {
        // read document
        using (StreamReader sr = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
        {
            doc_text = sr.ReadToEnd();
        }

        // find replace
        for (byte b = 0; b < s_find.Count; b++)
        {
            doc_text = new Regex(s_find[b], RegexOptions.IgnoreCase).Replace(doc_text, s_replace[b]);
            // regexText = new Regex(s_find[b]);
            // doc_text = doc_text.Replace(s_find[b], s_replace[b]);
            // doc_text = regexText.Replace(doc_text, s_replace[b]);
        }

        // update document
        using (StreamWriter sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
        {
            sw.Write(doc_text);
        }
    }
Ross Kelly

This might be useful to you - https://stackoverflow.com/questions/28697701/openxml-tag-search/28719853#28719853 – petelids Sep 24 '18 at 11:23

2 Answers

Note: I want to avoid using Word Interop. I don't want to create an instance of Word and use Word's object model to do the Find/Replace.

There is no way to avoid Word splitting text into multiple runs. It happens even if you type text directly into the document, make no changes and apply no formatting.

However, I found a way around the problem by adding custom fields to the document as follows:

  • Open the Word document and go to File->Info.
  • Click the Properties heading and select Advanced Properties.
  • Select the Custom tab.
  • Add the field names you want to use and save.
  • In the document, click Insert on the main menu.
  • Click the Explore Quick Parts icon and select Field...
  • Open the Categories drop-down and select Document Information.
  • Under Field names, select DocProperty.
  • Select your custom field name in the "Property" list and click OK.

This inserts the field into your document and even if you apply formatting, the field name will be whole and not be broken into multiple runs.

Update

To save users the laborious task of manually adding a lot of custom properties to a document, I wrote a method to do this using OpenXML.

Add the following usings:

using System;
using System.Collections.Generic;
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.CustomProperties;
using DocumentFormat.OpenXml.VariantTypes;

Code to add custom (text) properties to the document:

// Adds custom text properties to a .docx file.
// strName and strVal are parallel lists: strVal[i] is the value for strName[i].
// Returns false if the file does not exist or an error occurs.
public static bool RunWordDocumentAddProperties(string filePath, List<string> strName, List<string> strVal)
{
    bool is_ok = true;
    try
    {
        if (!File.Exists(filePath))
            return false;

        using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(filePath, true))
        {
            var customProps = wordDoc.CustomFilePropertiesPart;
            if (customProps == null)
            {
                // no custom properties? Add the part, and the collection of properties
                customProps = wordDoc.AddCustomFilePropertiesPart();
                customProps.Properties = new DocumentFormat.OpenXml.CustomProperties.Properties();
            }

            var props = customProps.Properties;
            if (props != null)
            {
                // int rather than byte, so lists with more than 255 entries do not overflow the counter
                for (int i = 0; i < strName.Count; i++)
                {
                    var newProp = new CustomDocumentProperty();
                    newProp.VTLPWSTR = new VTLPWSTR(strVal[i]);
                    newProp.FormatId = "{D5CDD505-2E9C-101B-9397-08002B2CF9AE}";
                    newProp.Name = strName[i];
                    props.AppendChild(newProp);
                }

                // fix up all the property ID values once, after appending
                // property ID values must start at 2
                int pid = 2;
                foreach (CustomDocumentProperty item in props)
                {
                    item.PropertyId = pid++;
                }
                props.Save();
            }
        }
    }
    catch (Exception ex)
    {
        is_ok = false;
        ProcessError(ex); // my own error handler (not shown here)
    }
    return is_ok;
}
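
One caveat with this workaround: a DocProperty field keeps its last displayed result in the document body, so changing a property value through OpenXML does not by itself change the text shown until the field is updated. The following is a minimal sketch, assuming it is acceptable for Word to refresh fields the next time the document is opened (Word may ask the user to confirm). The names `FieldRefresh` and `RequestFieldUpdateOnOpen` are hypothetical and not part of the code above; `UpdateFieldsOnOpen` is the SDK class for the `w:updateFields` document setting.

```csharp
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class FieldRefresh
{
    // Hypothetical helper: sets the "update fields on open" flag in the
    // document settings so Word recalculates DocProperty fields on open.
    public static void RequestFieldUpdateOnOpen(WordprocessingDocument wordDoc)
    {
        MainDocumentPart mainPart = wordDoc.MainDocumentPart;

        // Create the settings part (and its root element) if the document has none.
        DocumentSettingsPart settingsPart = mainPart.DocumentSettingsPart;
        if (settingsPart == null)
        {
            settingsPart = mainPart.AddNewPart<DocumentSettingsPart>();
            settingsPart.Settings = new Settings();
        }

        // Reuse an existing flag if present, so repeated calls do not add duplicates.
        UpdateFieldsOnOpen flag = settingsPart.Settings.GetFirstChild<UpdateFieldsOnOpen>();
        if (flag == null)
        {
            flag = new UpdateFieldsOnOpen();
            settingsPart.Settings.PrependChild(flag);
        }
        flag.Val = new OnOffValue(true);

        settingsPart.Settings.Save();
    }
}
```

It would be called inside the same `using (WordprocessingDocument ...)` block that writes the property values, for example `FieldRefresh.RequestFieldUpdateOnOpen(wordDoc);`.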
Ross Kelly

You only need to do this:

*.csproj

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>netcoreapp3.1</TargetFramework>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="DocumentFormat.OpenXml" Version="2.12.3" />
  </ItemGroup>

</Project>

Add these using directives:

using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

Then add this code to your program:

using (WordprocessingDocument wordprocessingDocument =
    WordprocessingDocument.Open(filepath, true))
{
    var body = wordprocessingDocument.MainDocumentPart.Document.Body;

    var paras = body.Elements<Paragraph>();

    foreach (var para in paras)
    {
        foreach (var run in para.Elements<Run>())
        {
            foreach (var text in run.Elements<Text>())
            {
                if (text.Text.Contains("#_KEY_1_#"))
                {
                    text.Text = text.Text.Replace("#_KEY_1_#", "replaced-text");
                }
            }
        }
    }
}

done
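
Note that the loop above only finds a placeholder that sits entirely inside one Text element. If Word has split it across runs, as with "{ConsultantName}" in the question, no single Text contains the whole key. The following is a minimal sketch of one way to handle that, assuming it is acceptable for the replaced paragraph to take the formatting of the run that holds its first Text (the other runs are emptied, so any per-run formatting differences in that paragraph are lost). `PlaceholderReplacer` and `ReplaceInParagraph` are hypothetical names introduced for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class PlaceholderReplacer
{
    // Hypothetical helper: replaces placeholders in one paragraph even when
    // Word has split a placeholder across several runs.
    // Returns true if the paragraph text was changed.
    public static bool ReplaceInParagraph(Paragraph para, IDictionary<string, string> replacements)
    {
        // Descendants (rather than Elements) also reaches runs nested in
        // hyperlinks and similar containers.
        List<Text> texts = para.Descendants<Text>().ToList();
        if (texts.Count == 0)
            return false;

        // Join the pieces so a split placeholder becomes one searchable string.
        string original = string.Concat(texts.Select(t => t.Text));
        string updated = original;
        foreach (KeyValuePair<string, string> pair in replacements)
            updated = updated.Replace(pair.Key, pair.Value);

        // Leave untouched paragraphs (and their formatting) exactly as they were.
        if (updated == original)
            return false;

        // Put the whole result into the first Text and empty the rest.
        texts[0].Text = updated;
        texts[0].Space = SpaceProcessingModeValues.Preserve; // keep leading/trailing spaces
        foreach (Text t in texts.Skip(1))
            t.Text = string.Empty;

        return true;
    }
}
```

Inside the `foreach (var para in paras)` loop it could be called as `PlaceholderReplacer.ReplaceInParagraph(para, new Dictionary<string, string> { { "{ConsultantName}", "John Doe" } });`. Paragraphs inside tables, headers or footers are not reached by `body.Elements<Paragraph>()` and would need to be enumerated separately.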

Felipe Augusto