At work, we have a Word document that we constantly have to edit and pass on to another team to tell them how to perform some tasks. Since I don't like mindlessly filling out data, and I always look for ways to simplify the tasks I have to do, I decided to automate this process. After considering a few methods (such as generating a Word document from scratch or editing an existing document), I decided to edit the document in-place.
I have inserted special tags into the document (specifically, they take the form [SOME_NAME_HERE]), which I then search for and replace with the values I actually need. To do that, I extract the .docx to a folder containing all of its XML documents and parse the document.xml file, replacing the tags with their values.
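For illustration, the replacement step might look roughly like the Python sketch below. This is not my actual code: the function names `fill_placeholders` and `rewrite_docx` and the `values` dictionary are made up for the example, and it assumes the tags consist only of uppercase letters, digits, and underscores. It also reads the .docx directly as a zip archive instead of extracting it to a folder first.

```python
import re
import zipfile
from xml.sax.saxutils import escape


def fill_placeholders(xml_text, values):
    """Replace [TAG_NAME] placeholders; unknown tags are left untouched."""
    def substitute(match):
        name = match.group(1)
        if name in values:
            # Escape &, < and > so the result is still valid XML.
            return escape(values[name])
        return match.group(0)

    return re.sub(r"\[([A-Z0-9_]+)\]", substitute, xml_text)


def rewrite_docx(src_path, dst_path, values):
    """Copy a .docx, substituting placeholders in word/document.xml only."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":
                text = fill_placeholders(data.decode("utf-8"), values)
                data = text.encode("utf-8")
            dst.writestr(item, data)
```

One caveat with plain text replacement: Word sometimes splits a piece of text across several runs, so a tag like [SOME_NAME_HERE] may not appear as one contiguous string in document.xml, and this sketch would miss it in that case.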
During this process, depending on what is actually needed, some sections of the document have to be removed. So my first thought was to add comments to the document.xml file. For example:
<!-- INITIAL BUILD ONLY -->
<w:p w:rsidR="00202319" w:rsidRPr="00D00FF5" w:rsidRDefault="00202319" w:rsidP="00AC0192">
<w:r w:rsidR="00E548A2" w:rsidRPr="00D00FF5">
<w:rPr>
<w:rStyle w:val="emcfontstrong"/>
</w:rPr>
<w:t>Some text here</w:t>
</w:r>
</w:p>
<!-- END INITIAL BUILD ONLY -->
Then, when I generate the output Word document, I would simply remove all of the sections marked "INITIAL BUILD ONLY" (unless, of course, it is the initial build).
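As a rough sketch of that removal step in Python, working on the text of document.xml (the function name `apply_build_section` and its `keep` flag are hypothetical, and it assumes the comment markers are still present in the XML and are written exactly as in the example above):

```python
import re


def apply_build_section(xml_text, label, keep):
    """Drop a marked section, or keep it and drop only the marker comments.

    A section starts at <!-- LABEL --> and ends at <!-- END LABEL -->.
    """
    name = re.escape(label)
    if keep:
        # Initial build: keep the content, remove only the two markers.
        markers = re.compile(r"<!--\s*(?:END )?" + name + r"\s*-->\s*")
        return markers.sub("", xml_text)
    # Otherwise remove everything from the start marker to the end marker.
    section = re.compile(
        r"<!--\s*" + name + r"\s*-->.*?<!--\s*END " + name + r"\s*-->\s*",
        re.DOTALL,
    )
    return section.sub("", xml_text)
```

It would be called as `apply_build_section(xml_text, "INITIAL BUILD ONLY", keep=is_initial_build)`, where `is_initial_build` is whatever flag the generator already has.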
However, the issue I am running into is that when you convert the folder back into a Word document, open it in Word, and save it, Word will "clean up" the document and remove all of the comments I've added to it.
So, my question is: is there any way to preserve the comments in the document, or are there any special tags I could add to the XML that would not be visible during standard viewing and editing of the document, but would not be removed by Word upon a save?