Tagging a document region
The neatest way to "tag" a region of a document is to use a content control.
If you use a block-level "rich text" content control, then it can contain block-level content such as paragraphs and tables, as well as nested content controls.
Here's a simple example of a rich text content control (with some useful properties set).
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" >
<w:body>
<w:p>
<w:r>
<w:t>An ordinary top level p</w:t>
</w:r>
</w:p>
<w:sdt>
<w:sdtPr>
<w:alias w:val="my title"/>
<w:tag w:val="my tag"/>
<w:id w:val="1508253281"/>
<w:lock w:val="sdtLocked"/>
</w:sdtPr>
<w:sdtContent>
<w:p >
<w:r>
<w:t>This is a paragraph in a rich text content control.</w:t>
</w:r>
</w:p>
<w:p >
<w:r>
<w:t>Another paragraph </w:t>
</w:r>
</w:p>
<w:tbl>
<!-- table content -->
</w:tbl>
</w:sdtContent>
</w:sdt>
</w:body>
</w:document>
Because a content control's content is inside its sdtContent element, content controls are nice to manipulate from an XML point of view. (Compare bookmarks, for example, which have bookmarkStart and bookmarkEnd point tags, which could have different parent elements!)
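To illustrate why the sdtContent wrapper makes this easy, here is a minimal sketch using Python's standard xml.etree.ElementTree. The helper names (find_sdt_by_tag, replace_sdt_text) and the trimmed-down document are hypothetical, made up for illustration; a real docx would need word/document.xml pulled out of the zip package first, and this only handles plain text paragraphs.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}
ET.register_namespace("w", W)


def find_sdt_by_tag(root, tag_value):
    """Return the first w:sdt whose w:sdtPr/w:tag has the given w:val, else None."""
    for sdt in root.iter(f"{{{W}}}sdt"):
        tag = sdt.find("w:sdtPr/w:tag", NS)
        if tag is not None and tag.get(f"{{{W}}}val") == tag_value:
            return sdt
    return None


def replace_sdt_text(sdt, paragraphs):
    """Replace everything inside w:sdtContent with one plain w:p per string."""
    content = sdt.find("w:sdtContent", NS)
    for child in list(content):
        content.remove(child)
    for text in paragraphs:
        p = ET.SubElement(content, f"{{{W}}}p")
        r = ET.SubElement(p, f"{{{W}}}r")
        t = ET.SubElement(r, f"{{{W}}}t")
        t.text = text


# Hypothetical, trimmed-down stand-in for word/document.xml.
DOCUMENT_XML = f"""<w:document xmlns:w="{W}">
  <w:body>
    <w:sdt>
      <w:sdtPr><w:tag w:val="my tag"/></w:sdtPr>
      <w:sdtContent>
        <w:p><w:r><w:t>Old text</w:t></w:r></w:p>
      </w:sdtContent>
    </w:sdt>
  </w:body>
</w:document>"""

root = ET.fromstring(DOCUMENT_XML)
sdt = find_sdt_by_tag(root, "my tag")
replace_sdt_text(sdt, ["First new paragraph.", "Second new paragraph."])
new_texts = [t.text for t in sdt.iter(f"{{{W}}}t")]
print(new_texts)
```

Everything that belongs to the region is a child of one element, so the replacement is a simple "clear and append". Doing the same with a bookmark would mean walking from bookmarkStart to bookmarkEnd across possibly different parents.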
Once you have settled on content controls as the solution to your need #1, you have a choice to make regarding your need #2:
Replacing the content control content with formatted text
Inserting arbitrary content is a little complex, since you have to take care of relationships to other parts. I'd suggest you use code to merge docx files: see "Merge multiple word documents into one Open Xml". The Document Builder approach is more robust than altChunk, since altChunk requires the document to be opened in an altChunk-aware processor (e.g. Word or Plutext's) to convert the altChunk to normal docx content.
Alternatively, if you can assume the docx will be opened in Word 2013, you can use w15 rich text data binding. You put your formatted content in a custom XML part (as Flat OPC XML), and Word will automatically update the document with that content.
To get started with this, consider the following sample XML:
Sample XML
<myxml>
<someelement>blagh</someelement>
<yourdb>
<content1>FLAT-OPC</content1>
</yourdb>
</myxml>
Upload it to this service I wrote, and, as described in this blog post, it'll give you a docx back containing a content control with a w15:dataBinding.
Resulting content control
<w:sdt>
<w:sdtPr>
<w15:dataBinding w:prefixMappings="" w:xpath="/myxml[1]/yourdb[1]/content1[1]" w:storeItemID="{115f7b60-1982-4ec7-afc5-28d28886db4b}"/>
<w:richText/>
</w:sdtPr>
<w:sdtContent>
<w:p>
<w:r>
<w:t>Rich Word content can go here</w:t>
</w:r>
</w:p>
</w:sdtContent>
</w:sdt>
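If you want to check which custom XML element a content control like this is bound to, the following is a rough sketch in Python (standard library only). The helpers read_binding and resolve_simple_xpath are hypothetical names of mine. The resolver is deliberately tiny: it assumes the binding XPath has the positional form shown above (/a[1]/b[1]/...) and that prefixMappings is empty, so it is not a general XPath evaluator.

```python
import re
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W15 = "http://schemas.microsoft.com/office/word/2012/wordml"

# Trimmed copy of the content control above, with namespaces declared inline.
SDT_XML = f"""<w:sdt xmlns:w="{W}" xmlns:w15="{W15}">
  <w:sdtPr>
    <w15:dataBinding w:prefixMappings="" w:xpath="/myxml[1]/yourdb[1]/content1[1]"
        w:storeItemID="{{115f7b60-1982-4ec7-afc5-28d28886db4b}}"/>
  </w:sdtPr>
  <w:sdtContent/>
</w:sdt>"""

CUSTOM_XML = """<myxml>
  <someelement>blagh</someelement>
  <yourdb><content1>FLAT-OPC</content1></yourdb>
</myxml>"""


def read_binding(sdt):
    """Return (xpath, storeItemID) from w:sdtPr/w15:dataBinding, or None."""
    binding = sdt.find(f"{{{W}}}sdtPr/{{{W15}}}dataBinding")
    if binding is None:
        return None
    return binding.get(f"{{{W}}}xpath"), binding.get(f"{{{W}}}storeItemID")


def resolve_simple_xpath(root, xpath):
    """Resolve a path like /a[1]/b[2] (no namespaces, positions only)."""
    steps = re.findall(r"/([^/\[\]]+)\[(\d+)\]", xpath)
    if not steps or "".join(f"/{n}[{i}]" for n, i in steps) != xpath:
        raise ValueError(f"unsupported XPath form: {xpath}")
    name, index = steps[0]
    if root.tag != name or index != "1":
        return None
    node = root
    for name, index in steps[1:]:
        matches = node.findall(name)  # direct children with this tag
        position = int(index)
        if not 1 <= position <= len(matches):
            return None
        node = matches[position - 1]
    return node


sdt = ET.fromstring(SDT_XML)
xpath, store_item_id = read_binding(sdt)
custom_root = ET.fromstring(CUSTOM_XML)
bound = resolve_simple_xpath(custom_root, xpath)
print(store_item_id, bound.text)
```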
After you've edited the content of this content control in Word 2013, inspect the custom XML part:
CustomXML part content
<myxml>
<someelement>blagh</someelement>
<yourdb>
<content1>
&lt;?xml version="1.0" standalone="yes"?&gt;
&lt;?mso-application progid="Word.Document"?&gt;
&lt;pkg:package xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage"&gt;&lt;pkg:part pkg:name="/_rels/.rels" pkg:contentType="application/vnd.openxmlformats-package.relationships+xml" pkg:padding="512"&gt;&lt;pkg:xmlData&gt;...&lt;/pkg:xmlData&gt;&lt;/pkg:part&gt;&lt;/pkg:package&gt;
</content1>
</yourdb>
</myxml>
You can see that the content1 element now contains escaped Flat OPC XML.
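Because the Flat OPC package is stored as escaped text, reading it back is a two-step parse: first the custom XML part, then the text of the bound element. Here is a small sketch of that, assuming Python's standard library. The function name flat_opc_part_names and the shortened two-part package are placeholders I made up, not real Word output.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

PKG = "http://schemas.microsoft.com/office/2006/xmlPackage"

# Hypothetical, heavily shortened Flat OPC package (real ones carry full parts).
FLAT_OPC = (
    '<?xml version="1.0" standalone="yes"?>\n'
    '<?mso-application progid="Word.Document"?>\n'
    f'<pkg:package xmlns:pkg="{PKG}">'
    '<pkg:part pkg:name="/_rels/.rels"><pkg:xmlData/></pkg:part>'
    '<pkg:part pkg:name="/word/document.xml"><pkg:xmlData/></pkg:part>'
    "</pkg:package>"
)

# The custom XML part holds the package as escaped text inside content1.
CUSTOM_XML = (
    "<myxml><someelement>blagh</someelement>"
    f"<yourdb><content1>{escape(FLAT_OPC)}</content1></yourdb></myxml>"
)


def flat_opc_part_names(custom_xml, element_path):
    """Parse the custom XML, then parse the bound element's text as Flat OPC."""
    root = ET.fromstring(custom_xml)
    holder = root.find(element_path)
    # The parser has already unescaped &lt; and &gt;, so the text is plain XML.
    package = ET.fromstring(holder.text.strip())
    return [part.get(f"{{{PKG}}}name") for part in package.iter(f"{{{PKG}}}part")]


names = flat_opc_part_names(CUSTOM_XML, "yourdb/content1")
print(names)
```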
The beauty of this is:
- that content is self-contained; it has everything necessary for it to be rendered (i.e. all styles, relationships etc.)
- the binding is bi-directional: the user will see your database content when they open the document in Word 2013, and if they are allowed to edit that content, any changes they make will be reflected in the custom XML part (so you can easily save the modified content back to a database if you like)
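Going the other way (pushing content from your database into the custom XML part before the user opens the document) could look roughly like the sketch below. It assumes you already have a Flat OPC string from somewhere; set_bound_content and the sample strings are hypothetical, and writing the updated part back into the docx zip is left out.

```python
import xml.etree.ElementTree as ET

CUSTOM_XML = """<myxml>
  <someelement>blagh</someelement>
  <yourdb><content1>FLAT-OPC</content1></yourdb>
</myxml>"""

# Placeholder for a Flat OPC package fetched from your database.
NEW_FLAT_OPC = (
    '<pkg:package xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage">'
    '<pkg:part pkg:name="/word/document.xml"><pkg:xmlData/></pkg:part>'
    "</pkg:package>"
)


def set_bound_content(custom_xml, element_path, flat_opc):
    """Store flat_opc as the text of the bound element and return the new XML."""
    root = ET.fromstring(custom_xml)
    target = root.find(element_path)
    if target is None:
        raise KeyError(element_path)
    # Assigning to .text is enough: the serializer escapes < > & on output.
    target.text = flat_opc
    return ET.tostring(root, encoding="unicode")


updated = set_bound_content(CUSTOM_XML, "yourdb/content1", NEW_FLAT_OPC)
round_trip = ET.fromstring(updated).find("yourdb/content1").text
print(updated)
```

The point of setting the element's text (rather than appending child elements) is that the package ends up escaped, which is the form shown in the custom XML part above.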