I have an XML file that has an element value like the following:
<ProjectNote>
<Note><!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE></TITLE>
<META http-equiv=Content-Type content="text/html; charset=unicode">
<META content="MSHTML 6.00.3790.4944" name=GENERATOR></HEAD>
<BODY bgColor=#ffffff>
<P>Key Deliverables</P>
<UL>
<LI>schedule development
<LI>scope development (SOW)
<LI>business case (depending on project)
<LI>contracts (who will be used)
<LI>overall budget
<LI>Assign Key Stakeholders
<LI>Sitewalks and PreCon Meetings
<LI>Need Clearance?</LI></UL>
<P>&nbsp;</P></BODY></HTML>
</Note>
</ProjectNote>
I am reading this file with a Groovy script, making some changes to it, and writing it back to a file. However, the "
is getting converted to "
while parsing the file with XmlSlurper. I don't want to change any section of the file; I only want to add a new node to it. How can I keep the rest of the file exactly as it is?
I am using the following code:
package test

import groovy.xml.*

/**
 * A simple example that parses an XML file with XmlSlurper, wraps it in a
 * new root element, and writes the result to a second file.
 */
class Test {
    static srcXMLPath = "C:/SRC_Project/628548_C453_Original.xml"
    static updXMLPath = "C:/SRC_Project/628548_C453_Updated.xml"
    static def writer

    static main(args) {
        File srcFile = new File(srcXMLPath)
        // Non-validating, non-namespace-aware parse of the source file
        def baseXMLStr = new XmlSlurper(false, false).parse(srcFile)
        // Wrap the parsed document in a new List_Wrapper root element
        def newXMLStr = new groovy.xml.StreamingMarkupBuilder().bind {
            List_Wrapper {
                mkp.yield baseXMLStr
            }
        }
        // Serialize the wrapped document to the output file
        writer = new FileWriter(updXMLPath)
        groovy.xml.XmlUtil.serialize(newXMLStr, writer)
        writer.close()
    }
}
However, the updated file is changed to the following, which is not what I intended:
<ProjectNote>
<Note><!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE></TITLE>
<META http-equiv=Content-Type content="text/html; charset=unicode">
<META content="MSHTML 6.00.3790.4944" name=GENERATOR></HEAD>
<BODY bgColor=#ffffff>
<P>Key Deliverables</P>
<UL>
<LI>As Builts (if needed)
<UL>
<LI>Mapping &amp; Design Drawings</LI></UL>
<LI>Engineer needs final approval
<P>&nbsp;</P></BODY></HTML>
</Note>
</ProjectNote>
Could someone let me know how to avoid this? It is clearly not changing the other escaped characters.
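For reference, here is a smaller reproduction of what I think is happening. It is only a sketch: the one-element document is made up, and I am assuming the reference in question is a carriage-return character reference such as &#xD;. As far as I understand, an XML parser resolves character references while reading, so the parsed tree holds only the resulting character and the serializer later picks its own notation for it.

```groovy
import groovy.xml.*

// Hypothetical one-element document; &#xD; stands in for the character
// reference from my real file (an assumption for illustration).
def xml = '<Note>line one&#xD;line two</Note>'

// The parser resolves the reference while reading, so the parsed tree only
// holds the carriage-return character, not the original "&#xD;" spelling.
def note = new XmlSlurper(false, false).parseText(xml)
def text = note.text()

assert text == 'line one\rline two'
assert !text.contains('&#xD;')

// Whatever the serializer writes for that character is its own choice of
// notation; the original spelling is no longer available to it.
println XmlUtil.serialize(note)
```

If that is right, the change happens at parse time rather than in XmlUtil.serialize, which would explain why the original spelling cannot simply be switched back on when writing.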