I have many long documents that need to be parsed. The document format resembles XML but is not actually valid XML.
Here's an example:
<DOC>
<TEXT>it's the content P&G</TEXT>
</DOC>
<DOC>
<TEXT>it's another</TEXT>
</DOC>
Note that there are multiple root tags (<DOC>), and the bare & should be escaped as &amp; in XML.
Thus, the above file is not standard XML.
Can I use XmlDocument to parse the file, or should I write my own parser?