
I have a bunch of XML files that all conform to the same schema. The element I want to batch edit occurs exactly once in each of the XML files and has an identical XPath in each of them.

e.g.  /Valid/HeaderInfo/SoftwareID

I want to create a script/procedure so that I can replace the value (I believe it is more accurately called the text value of the node) of this specific element and apply that update to all the XML files in a folder or group of folders. For instance, right now it is:

<SoftwareID>12451245</SoftwareID>

I want it to be

<SoftwareID>53623745</SoftwareID>

instead.

I am just starting out in the world of programming, as well as learning about XML data in general, and I need some fundamental information about how to even start. What would be the best way to do this? I have Altova XMLSpy and I know there is a scripting component to it. But is it more appropriate to do this in a specific programming language (I am currently learning Visual Basic), or is there some other software that exists for performing these types of batch updates?

Any information that would point me in the right direction would be great!

Thanks!

Update (06/26/13)

The XPath to FilingSoftwareId (and the updated element name) is actually:

  ValidFiling/FilingHeader/FilingSoftwareId  

With ValidFiling being the root of the XML document. I used what you provided and updated it accordingly, but the result is a duplicate of my original XML file when I select this XSL file for XSL Transformation in Altova XMLSpy.

Is it possible that the update to FilingSoftwareId is being replaced with the original value when the second, catch-all template is applied to the document?

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output omit-xml-declaration="yes" indent="yes" method="xml"/>
    <xsl:template match="/ValidFiling/FilingHeader/FilingSoftwareId">
        <FilingSoftwareId>
            <xsl:text>243523452345</xsl:text>
        </FilingSoftwareId>
    </xsl:template>
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

Thanks again!

smk081

1 Answer

I would write an identity-transform XSL and then apply it to all the XML files in that folder using whatever batch technology you wish (a .bat file, or a VB app if you like). Write a template that matches the specific element or elements you wish to change, and then include a general template that copies everything else to the output as is.

Without testing, it should be something like this:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="//Valid/HeaderInfo/SoftwareID">
    <SoftwareID>
        <xsl:text>53623745</xsl:text>
    </SoftwareID>        
</xsl:template>
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>
</xsl:stylesheet>

You could even expand that example to pass the new value into the transform as a parameter, so you never have to edit the stylesheet itself.
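As a rough sketch of that parameter idea (untested, and the parameter name `newSoftwareId` is only an illustrative choice, not something from the original files), the stylesheet could declare a top-level `xsl:param` with a default value and write it into the matched element. How the value is supplied from outside depends on the processor or calling code you use.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <!-- Hypothetical parameter name; the default is used when the caller passes nothing -->
    <xsl:param name="newSoftwareId" select="'53623745'"/>

    <!-- Replace only the text of the target element, keeping its name and attributes -->
    <xsl:template match="/Valid/HeaderInfo/SoftwareID">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:value-of select="$newSoftwareId"/>
        </xsl:copy>
    </xsl:template>

    <!-- Identity template: copy everything else unchanged -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

Using `xsl:copy` rather than a literal `<SoftwareID>` element means the output element always has exactly the same name as the input element, which should help keep the result valid against the original schema.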

Kevin Brown
  • Thanks, Kevin! I appreciate the detailed response. Using this XSL, will my XML files still be valid against the original schema? (Assuming, of course, that my new value for SoftwareID is valid, which it is.) I have not yet tested this, but will I need to incorporate all the other elements that exist in the schema to make it "complete"? So any programming language can "talk" to the XML and manipulate it? As another aside, I imagine it's possible (anything seems to be), but if I wanted to set the new SoftwareID as a sequential number increasing by 1 with each XML file, would that be possible? – smk081 Jun 18 '13 at 23:25
  • For the schema part, you would add the schema into the XSL so that they are recognized, processed and output. You do not need to add anything else: the template that matches @*|node() is a basic catch-all that matches everything not explicitly matched elsewhere. You could use any language that can call an XSLT transformer; in VB you would likely use XslCompiledTransform right in the code. And yes, you could add a parameter that is passed into the transform (like the SoftwareID) and use that param in the XSL. – Kevin Brown Jun 19 '13 at 00:18
  • And for reasons ... I would always use an XML-aware method for manipulating XML as the safest/best way. Certainly one could look at your problem and try to write Regex or string manipulation, but that is not the right way to manipulate XML. You should use XSL or programmatically use the DOM. You could test the above on one XML using Altova XMLSpy or even use it to process the whole directory. – Kevin Brown Jun 19 '13 at 00:26
  • I was finally able to test this solution with my XML file. The result is that it outputs a new file identical to what I originally had (this being from the "catch-all" template, I imagine), but it does not update the SoftwareID number as specified. The original value was retained. No errors are thrown either. Do I need to include "//" in the match? If SoftwareID only appears once, can I use match="SoftwareID" instead of specifying the entire XPath to the element? – smk081 Jun 25 '13 at 17:13
  • Without you specifying how you tested it, it is impossible to diagnose. I will tell you that, given the input you gave and the XSL I specified, it will replace the SoftwareID with "53623745". If you have a different structure to reach the SoftwareID (than //Valid/HeaderInfo/SoftwareID), then of course it will not match. Post your XML structure and specify exactly what you wish replaced. – Kevin Brown Jun 26 '13 at 02:01
  • I just amended my original posting with what I tried and some more detail to what I am experiencing. Thanks a lot! – smk081 Jun 26 '13 at 12:58
  • I did some more experimenting and created a stripped-down version of the XML file that I was trying to transform. Your solution worked 100% on the test file, but whenever I tried it on the actual file(s), no dice. It creates a duplicate of the file and does not update the text of that element. What I found was that the XML file I was trying to transform has namespace declarations in the root element. After many iterations of testing, I found that when I removed these from the root I was able to transform the file. The problem is that I need these files to stay exactly the same, with that namespace retained. – smk081 Jun 26 '13 at 15:09
  • I removed some of the Xpath in the match to just go to FilingSoftwareID instead of the full //Valid/HeaderInfo/SoftwareID. Again, this worked on my stripped down test filing without the namespace info in the root element tag but not in the file I ultimately need to transform. – smk081 Jun 26 '13 at 15:11
  • Any other ideas about what might be causing the hang up? – smk081 Jul 02 '13 at 18:54
  • If you have namespace information in the XML, you need to add that namespace into the XSL so that the template actually matches the node in the namespace. – Kevin Brown Jul 04 '13 at 01:47
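A minimal sketch of that last suggestion, assuming the root element declares a default namespace. The URI `urn:example:filing` and the prefix `f` below are placeholders, since the real namespace was never posted; the URI must be replaced with the exact value of the `xmlns` attribute on ValidFiling. In XSLT 1.0 an unprefixed name in a match pattern only matches elements in no namespace, which is why the original template silently matched nothing and the catch-all copied the file unchanged.

```xml
<!-- Placeholder namespace URI: substitute the xmlns value from the real ValidFiling root -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:f="urn:example:filing"
                version="1.0">

    <!-- Each step is prefixed so it matches elements in the document's namespace -->
    <xsl:template match="/f:ValidFiling/f:FilingHeader/f:FilingSoftwareId">
        <!-- xsl:copy keeps the original element name and namespace as they were -->
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:text>243523452345</xsl:text>
        </xsl:copy>
    </xsl:template>

    <!-- Identity template: copy everything else unchanged -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

The replacement element is produced with `xsl:copy` instead of a literal `<FilingSoftwareId>` because a literal element written without a prefix would be emitted in no namespace, which would no longer match the schema. The prefix `f` exists only inside the stylesheet; it does not have to appear in the XML files.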