I am trying to strip carriage return and line feed characters (CRLF, #13#10 / #$D#$A) from my WideString before passing it to the IXMLDomDocument.LoadXML() function:

OleDoc := CreateOleObject('Microsoft.XMLDOM') as IXMLDomDocument;
try
  WideDecoded := StringReplace(WideDecoded,#13#10,'',[rfReplaceAll]);
  WideDecoded := trim(WideDecoded);
  OleDoc.loadXML(WideDecoded);
  OleDoc.parseError.linepos;
  OleDoc.parseError.srcText;
  OleDoc.parseError.url;
  OleDoc.parseError.line;
  OleDoc.parseError.reason;
  if OleDoc.parseError.errorCode <> 0 then
    raise Exception.Create('XML Load error:' + OleDoc.parseError.reason);
finally
  OleDoc := nil;
end;

The parseError.reason that I'm getting is:

an Invalid Character was found in the text content '#$D#$A'

Remy Lebeau
mobo
  • CR and LF characters are not invalid in XML. If present between XML elements, they produce extra text nodes in the DOM tree, unless the parser is configured to ignore them. If they are present inside an element's content, they are treated like any other text character. Also, you shouldn't be looking at the `parseError` values unless `loadXML()` returns false or `parseError.errorCode` is not 0, neither of which you are checking for, e.g.: `if not OleDoc.loadXML(WideDecoded) then begin ... end;` or `OleDoc.loadXML(WideDecoded); if OleDoc.parseError.errorCode <> 0 then begin ... end;` – Remy Lebeau Feb 04 '21 at 19:29
  • @RemyLebeau Sorry about that, I actually had it in my code but forgot to add it here. – mobo Feb 04 '21 at 19:51
  • Reading the `parseError`'s `linepos`, `srcText`, `url`, and `line` properties without doing anything with those values is useless. Either add them to your `Exception`'s message or remove them completely; as written, they just waste memory. That being said, you still haven't shown the actual XML you are having trouble parsing. Your code is fine, but I suspect the actual XML doesn't really contain raw CRLFs. Maybe it has bare LF and bare CR characters in it. Or maybe the CRLFs are encoded in a way that `StringReplace()` doesn't see them. Can't say without seeing the actual XML. – Remy Lebeau Feb 04 '21 at 20:03
  • Also, note that `StringReplace()` doesn't support `WideString`, so you are actually converting `WideString` to `string`, and then converting `string` back to `WideString`. – Remy Lebeau Feb 04 '21 at 21:37
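
The points raised in the comments can be combined into a rough sketch. It assumes the XML really does contain raw CR and/or LF characters (possibly bare ones, not only CRLF pairs), which has not been confirmed in the question. `StripLineBreaks` and `LoadXmlOrRaise` are hypothetical names introduced here for illustration, and the `MSXML` unit name is an assumption that may differ between Delphi versions (newer versions namespace it). The helper works on the `WideString` directly, so there is no round trip through `string`, and the `parseError` properties are read only after `loadXML()` reports failure.

```delphi
uses
  SysUtils, ComObj, MSXML; // unit names are assumptions; adjust for your Delphi version

// Hypothetical helper: removes every CR (#13) and LF (#10) from a WideString,
// including bare ones, without converting to string and back.
function StripLineBreaks(const S: WideString): WideString;
var
  I, Count: Integer;
begin
  SetLength(Result, Length(S));
  Count := 0;
  for I := 1 to Length(S) do
    if (S[I] <> #13) and (S[I] <> #10) then
    begin
      Inc(Count);
      Result[Count] := S[I];
    end;
  SetLength(Result, Count); // trim to the number of characters actually kept
end;

// Hypothetical wrapper: loads the XML and raises with the parser's details on failure.
procedure LoadXmlOrRaise(const Xml: WideString);
var
  OleDoc: IXMLDOMDocument;
begin
  OleDoc := CreateOleObject('Microsoft.XMLDOM') as IXMLDOMDocument;
  // Only consult parseError when loadXML() returns false.
  if not OleDoc.loadXML(StripLineBreaks(Xml)) then
    raise Exception.CreateFmt('XML load error at line %d, position %d: %s',
      [OleDoc.parseError.line, OleDoc.parseError.linepos,
       OleDoc.parseError.reason]);
end;
```

Note that stripping line breaks also removes them from element content, which changes the data. Since CR and LF are valid XML characters, this may not address the underlying parse error if the line breaks are encoded differently in the source text.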
