
With the help of some very kind community members here, I managed to write a function that programmatically replaces text inside content controls in a Word document using Open XML. However, once the document is generated, the replaced text has lost its formatting.

Any ideas on how I can keep the formatting in Word and also remove the content control tags?

This is my code:

using (var wordDoc = WordprocessingDocument.Open(mem, true))
{

    var mainPart = wordDoc.MainDocumentPart;

    ReplaceTags(mainPart, "FirstName", _firstName);
    ReplaceTags(mainPart, "LastName", _lastName);
    ReplaceTags(mainPart, "WorkPhoe", _workPhone);
    ReplaceTags(mainPart, "JobTitle", _jobTitle);

    mainPart.Document.Save();
    SaveFile(mem);
}

private static void ReplaceTags(MainDocumentPart mainPart, string tagName, string tagValue)
{
    //grab all the tag fields
    IEnumerable<SdtBlock> tagFields = mainPart.Document.Body.Descendants<SdtBlock>().Where
        (r => r.SdtProperties.GetFirstChild<Tag>().Val == tagName);

    foreach (var field in tagFields)
    {
        //remove all paragraphs from the content block
        field.SdtContentBlock.RemoveAllChildren<Paragraph>();
        //create a new paragraph containing a run and a text element
        Paragraph newParagraph = new Paragraph();
        Run newRun = new Run();
        Text newText = new Text(tagValue);
        newRun.Append(newText);
        newParagraph.Append(newRun);
        //add the new paragraph to the content block
        field.SdtContentBlock.Append(newParagraph);
    }
}
petelids
Ilyas

1 Answer

Keeping the style is a tricky problem as there could be more than one style applied to the text you are trying to replace. What should you do in that scenario?

Assuming the simple case of a single style (though potentially spread over many Paragraphs, Runs and Texts), you could keep the first Text element you come across in each SdtBlock, put the required value in that element, and then delete any further Text elements from the SdtBlock. The formatting is then maintained, because it is stored on the Run (and Paragraph) that contains the Text rather than on the Text itself. The same approach works with any of the Text elements; you don't necessarily have to use the first. The following code should show what I mean:

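private static void ReplaceTags(MainDocumentPart mainPart, string tagName, string tagValue)
{
    // ToList() takes a snapshot, so editing the tree cannot disturb the enumeration.
    // The null-conditional operators skip content controls that have no Tag.
    List<SdtBlock> tagFields = mainPart.Document.Body.Descendants<SdtBlock>()
        .Where(r => r.SdtProperties?.GetFirstChild<Tag>()?.Val?.Value == tagName)
        .ToList();

    foreach (var field in tagFields)
    {
        // Snapshot the Text elements too: Descendants<Text>() is lazy, so removing
        // elements while indexing into it would skip some of them.
        List<Text> texts = field.SdtContentBlock.Descendants<Text>().ToList();

        for (int i = 0; i < texts.Count; i++)
        {
            if (i == 0)
            {
                // The first Text stays inside its Run, so it keeps that Run's formatting.
                texts[i].Text = tagValue;
            }
            else
            {
                texts[i].Remove();
            }
        }
    }
}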
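
The question also asks about removing the content control tags themselves. One way this might be done is to move the children of each matching SdtBlock's content block up to where the control sits and then delete the now-empty control. The following is only a minimal sketch: `ContentControlHelper` and `UnwrapTags` are hypothetical names (they are not part of the Open XML SDK), and it assumes block-level controls (`SdtBlock`) as in the question.

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class ContentControlHelper
{
    // Hypothetical helper: replaces each matching block-level content control
    // with its own content, so the paragraphs (and their formatting) stay
    // in place while the control wrapper disappears.
    public static void UnwrapTags(Body body, string tagName)
    {
        // ToList() takes a snapshot because the loop below modifies the tree.
        var fields = body.Descendants<SdtBlock>()
            .Where(s => s.SdtProperties?.GetFirstChild<Tag>()?.Val?.Value == tagName)
            .ToList();

        foreach (var field in fields)
        {
            var content = field.SdtContentBlock;
            if (content != null)
            {
                foreach (var child in content.ChildElements.ToList())
                {
                    child.Remove();                // detach from the content control
                    field.InsertBeforeSelf(child); // re-attach just before the control
                }
            }

            field.Remove(); // drop the empty control
        }
    }
}
```

You would call it after the replacements, for example `ContentControlHelper.UnwrapTags(mainPart.Document.Body, "FirstName");`, and before saving the document.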
petelids
  • When the text includes a paragraph, <p>Text</p>, the text is not added to the document file, but single-value texts are added and they keep their style. Any idea how I can also add multi-value text with the style intact? – Ilyas Mar 17 '15 at 07:50
  • By "Text" do you mean the text being replaced or the text you are replacing it with? I've tested both scenarios and I can't get either to fail with the above code. – petelids Mar 17 '15 at 12:29
  • Sorry, bad explanation. The text is added/replaced in the content control, but with HTML encoding, like this: <p>some awsome text</p><p>Yo yo</p>. I tried to remove the HTML encoding, but then of course it all ends up on the same line without the paragraph break, so I am trying to work out how to insert HTML-encoded text without showing the HTML tags. – Ilyas Mar 17 '15 at 15:42
  • Ah OK, that makes sense. The trouble is that the Docx format is *XML*, not HTML, so the output you are seeing is what I'd expect. I think you'd be better off using the code in [this answer](http://stackoverflow.com/a/29099130/3791802), but for each <p> element you have, create a new `Paragraph`, `Run` and `Text`. You can use `InsertAfter` to insert each paragraph after the previous one. If you assign the `RunProperties` to each `Run` you'll keep your styles as well. – petelids Mar 17 '15 at 17:43
  • That sounds complicated. As you must have figured out by now, I have not yet quite understood all this Open XML stuff with sdt, runs etc. :). Is it possible to convert the text to some XML, or run it through a RegEx, before inserting it into the content control? – Ilyas Mar 17 '15 at 20:21
  • Hi @Ilyas, I'm not too sure to be honest. I guess you could use something like the [HtmlAgilityPack](http://htmlagilitypack.codeplex.com/) to parse the HTML (I would avoid using Regex). Alternatively you could take a look at a converter such as [this one](https://html2openxml.codeplex.com/). I've never tried it so I've no idea what it's like to use. – petelids Mar 18 '15 at 12:30
  • Thanks for the tips. I will try searching a bit more, and test this one out also. – Ilyas Mar 22 '15 at 17:12
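
The suggestion in the comments above (one new `Paragraph`, `Run` and `Text` per paragraph of input, chained with `InsertAfter`, with the `RunProperties` copied onto each `Run`) could be sketched roughly as follows. This is a minimal sketch, not a tested solution: `MultiParagraphHelper` and `ReplaceWithParagraphs` are hypothetical names, the input is assumed to have already been split into one plain string per paragraph (for example by an HTML parser), and copying the first paragraph's `ParagraphProperties` is an extra assumption that goes beyond what the comment describes.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class MultiParagraphHelper
{
    // Hypothetical helper: fills one content control with one paragraph per
    // entry in "lines", reusing the formatting of the first existing run.
    public static void ReplaceWithParagraphs(SdtBlock field, IEnumerable<string> lines)
    {
        var content = field.SdtContentBlock;
        if (content == null) return;

        // Capture formatting templates before the old paragraphs are removed.
        RunProperties runTemplate = content.Descendants<Run>()
            .Select(r => r.RunProperties)
            .FirstOrDefault(p => p != null);
        ParagraphProperties paraTemplate = content.Descendants<Paragraph>()
            .Select(p => p.ParagraphProperties)
            .FirstOrDefault(p => p != null);

        content.RemoveAllChildren<Paragraph>();

        Paragraph previous = null;
        foreach (string line in lines)
        {
            var run = new Run();
            if (runTemplate != null)
                run.Append(runTemplate.CloneNode(true)); // properties must precede the text
            run.Append(new Text(line) { Space = SpaceProcessingModeValues.Preserve });

            var paragraph = new Paragraph();
            if (paraTemplate != null)
                paragraph.Append(paraTemplate.CloneNode(true));
            paragraph.Append(run);

            // The first paragraph is appended; each later one goes after the previous.
            if (previous == null)
                content.Append(paragraph);
            else
                content.InsertAfter(paragraph, previous);
            previous = paragraph;
        }
    }
}
```

Each template is cloned because an element can only have one parent in the tree, so the same `RunProperties` instance cannot be attached to several runs.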