
Question: How do I syntax-check my XML in modern browsers (anything but IE)?

I've seen a page on W3Schools which includes an XML syntax-checker. I don't know how it works, but I'd like to know how I may achieve the same behavior.

I've already performed many searches on the matter (with no success), and I've tried using the DOM Parser to check if my XML is "well-formed" (also with no success).

var xml = '<name>Caleb';
var parser = new DOMParser();
var doc = parser.parseFromString(xml, 'text/xml');

I expect the parser to tell me I have an XML syntax error (i.e. an unclosed name tag). However, it always returns an XML DOM object, as if there were no errors at all.

To summarize, I would like to know how I can automatically check the syntax of an XML document using JavaScript.

P.S. Is there any way I can validate an XML document against a DTD (using JS, and not IE)?

caleb531

6 Answers

Edit: Here is a more concise example, from MDN:

var xmlString = '<a id="a"><b id="b">hey!</b></a>';
var domParser = new DOMParser();
var dom = domParser.parseFromString(xmlString, 'text/xml');

// print the name of the root element or error message
// (console.log is used here because dump() is not available in most browsers)
console.log(dom.documentElement.nodeName === 'parsererror' ? 'error while parsing' : dom.documentElement.nodeName);
Tsvetan Ganev
NoBugs

NoBugs's answer above did not work for me in a current version of Chrome. I suggest:

var sMyString = "<a id=\"a\"><b id=\"b\">hey!<\/b><\/a>";
var oParser = new DOMParser();
var oDOM = oParser.parseFromString(sMyString, "text/xml");
// (console.log is used here because dump() is not available in most browsers)
console.log(oDOM.getElementsByTagName('parsererror').length ?
     (new XMLSerializer()).serializeToString(oDOM) : "all good"
);
Henryk Gerlach

You can also use the fast-xml-parser package, which provides a validate function for XML:

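// Pre-v4 API; xmlData (a string) and options are assumed to be defined elsewhere
import { validate, parse } from 'fast-xml-parser';

// validate() returns true for well-formed input, otherwise an error object
if (validate(xmlData) === true) {
  var jsonObj = parse(xmlData, options);
}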
Lucas Andrade
  • This is deprecated since `v4`; you should now import `XMLValidator` and call `XMLValidator.validate(xmlData)`. Have a look at the official [documentation](https://github.com/NaturalIntelligence/fast-xml-parser/blob/master/docs/v4/4.XMLValidator.md). – johannchopin Dec 13 '21 at 22:43
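For reference, here is a minimal sketch of handling the v4 result mentioned in the comment above. It assumes, as I understand the library's documentation, that `XMLValidator.validate` returns `true` for well-formed input and otherwise an object whose `err` property carries `code`, `msg` and `line`. The helper name `describeValidation` is hypothetical and not part of the library.

```javascript
// Hypothetical helper: formats the value returned by XMLValidator.validate().
// Assumption: the result is `true` for well-formed input, otherwise an object
// shaped like { err: { code, msg, line } }.
function describeValidation(result) {
  if (result === true) {
    return 'XML is well-formed';
  }
  const { code, msg, line } = result.err;
  return `${code} at line ${line}: ${msg}`;
}

// Usage with the real library (requires `npm install fast-xml-parser`):
//   const { XMLValidator } = require('fast-xml-parser');
//   console.log(describeValidation(XMLValidator.validate('<a><b></a>')));
```

Keeping the formatting separate from the library call means the same helper can be reused wherever the validation result is shown to the user.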

Press F12 on that page to open the developer tools and view the source. Search for validateXML and you will find a long, complete XML checker that you can use as a reference.

I am using React, and I use DOMParser to display the error message like this:

  handleXmlCheck = () => {
    const { fileContent } = this.state;
    const parser = new window.DOMParser();
    const theDom = parser.parseFromString(fileContent, 'application/xml');
    if (theDom.getElementsByTagName('parsererror').length > 0) {
      showErrorMessage(theDom.getElementsByTagName('parsererror')[0].getElementsByTagName('div')[0].innerHTML);
    } else {
      showSuccessMessage('Valid Xml');
    }
  }
Hearen

A basic XML validator in JavaScript. This code may not work for advanced XML, only for basic XML.

function xmlValidator(xml){
    // var xml = "<note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>";
    while(xml.indexOf('<') != -1){
        var sub = xml.substring(xml.indexOf('<'), xml.indexOf('>')+1);
        var value = xml.substring(xml.indexOf('<')+1, xml.indexOf('>'));
        var endTag = '</'+value+'>';
        if(xml.indexOf(endTag) != -1){
            // console.log('xml is valid');
            // break;
        }else{
            console.log('xml is invalid');
            break;
        }
        xml = xml.replace(sub, '');
        xml = xml.replace(endTag, '');
        console.log(xml);
        console.log(sub+' '+value+' '+endTag);
    }
}
var xml = "<note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>";
xmlValidator(xml);
  • Nice solution, but it does not tell you the line where the error happened. As you said, it's "basic". – Basheer AL-MOMANI Apr 06 '20 at 08:39
  • Thanks! Yes, I only report whether the XML is valid or invalid. I will go through your suggestion and modify the code. – Muhammad Haroon Iqbal Apr 06 '20 at 10:11
  • How can I get the line number? If that is impossible, I would rather report the XML tag on which the error occurs. Please suggest how I could get the line number. – Muhammad Haroon Iqbal Apr 06 '20 at 10:28
  • Have you tried the solution in the accepted answer? Keep in mind your solution is valid too. One enhancement would be to report the error by `tag name` rather than by line number, instead of showing nothing at all. Maybe simplicity is more important than all that. – Basheer AL-MOMANI Apr 06 '20 at 12:09
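The tag-name and line-number reporting discussed in these comments could be sketched roughly as follows. This is not the answer author's code: `findTagError` is a hypothetical helper that swaps the repeated `indexOf`/`replace` approach for a stack of open tags. It only checks that element tags are balanced, so it is still a "basic" check. Comments, CDATA sections, processing instructions and simple declarations are skipped, and attribute values containing `>` are not handled.

```javascript
// Hypothetical helper: returns null when all element tags are balanced,
// otherwise an object naming the offending tag and a 1-based line number.
function findTagError(xml) {
  // Alternatives, in order: comment, CDATA, processing instruction,
  // declaration (e.g. DOCTYPE without internal subset), element tag.
  const tagPattern = /<!--[\s\S]*?-->|<!\[CDATA\[[\s\S]*?\]\]>|<\?[\s\S]*?\?>|<![^>]*>|<(\/?)([^\s\/>]+)[^>]*?(\/?)>/g;
  const open = []; // stack of { name, line } for elements not yet closed
  let match;
  while ((match = tagPattern.exec(xml)) !== null) {
    const [, closing, name, selfClosing] = match;
    if (name === undefined) continue; // not an element tag, skip it
    const line = xml.slice(0, match.index).split('\n').length;
    if (closing) {
      const top = open.pop();
      if (!top) {
        return { tag: name, line, message: `</${name}> has no opening tag` };
      }
      if (top.name !== name) {
        return {
          tag: top.name,
          line,
          message: `expected </${top.name}> (opened on line ${top.line}) but found </${name}>`,
        };
      }
    } else if (!selfClosing) {
      open.push({ name, line });
    }
  }
  if (open.length > 0) {
    const top = open.pop();
    return { tag: top.name, line: top.line, message: `<${top.name}> is never closed` };
  }
  return null;
}

console.log(findTagError('<note><to>Tove</to></note>')); // null
console.log(findTagError('<a>\n<b>\n</a>'));             // reports tag "b", line 3
```

A stack is used because a closing tag must always match the most recently opened element, which the search-and-remove approach cannot verify.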
/**
 * Check if the input is a valid XML file.
 * @param xmlStr The input to be parsed.
 * @returns If the input is invalid, this returns the <parsererror> element explaining the problem.
 * If the input is valid, this returns undefined.
 */
export function xmlIsInvalid(xmlStr : string) : HTMLElement | undefined {
  const parser = new DOMParser();
  const dom = parser.parseFromString(xmlStr, "application/xml");
  // https://developer.mozilla.org/en-US/docs/Web/API/DOMParser/parseFromString
  // says that parseFromString() will throw an error if the input is invalid.
  //
  // https://developer.mozilla.org/en-US/docs/Web/Guide/Parsing_and_serializing_XML
  // says dom.documentElement.nodeName == "parsererror" will be true if the input
  // is invalid.
  //
  // Neither of those is true when I tested it in Chrome.  Nothing is thrown.
  // If the input is "" I get:
  // dom.documentElement.nodeName returns "html",
  // dom.documentElement.firstElementChild.nodeName returns "body" and
  // dom.documentElement.firstElementChild.firstElementChild.nodeName = "parsererror".
  //
  // It seems that the parsererror can move around.  It looks like it's trying to
  // create as much of the XML tree as it can, then it inserts parsererror whenever 
  // and wherever it gets stuck.  It sometimes generates additional XML after the
  // parsererror, so .lastElementChild might not find the problem.
  //
  // In case of an error the <parsererror> element will be an instance of
  // HTMLElement.  A valid XML document can include an element with the name
  // "parsererror", however it will NOT be an instance of HTMLElement.
  //
  // getElementsByTagName('parsererror') might be faster than querySelectorAll().
  for (const element of Array.from(dom.querySelectorAll("parsererror"))) {
    if (element instanceof HTMLElement) {
      // Found the error.
      return element;
    }
  }
  // No errors found.
  return;
}

(Technically that's TypeScript. Remove : string and : HTMLElement | undefined to make it JavaScript.)

Trade-Ideas Philip
  • I've updated this routine since I posted here. I now return a valid parsed document or an error message. https://github.com/TradeIdeasPhilip/hungry-cat-nonogram-solver/blob/master/lib/misc.ts#L29 – Trade-Ideas Philip Feb 12 '22 at 18:52
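The "parsed document or error message" shape mentioned in that comment could look something like the plain-JavaScript sketch below. It is not the code from the linked repository: `parseXml`, the `{ ok, document, error }` result shape and the injectable `ParserCtor` parameter are all assumptions made for illustration. It also uses the simpler `getElementsByTagName('parsererror')` check, so a valid document that legitimately contains an element named `parsererror` would be reported as an error.

```javascript
// Sketch only: returns { ok: true, document } or { ok: false, error }.
// ParserCtor defaults to the browser's DOMParser; pass any constructor with
// a compatible parseFromString() when running outside a browser.
function parseXml(xmlStr, ParserCtor = globalThis.DOMParser) {
  if (typeof ParserCtor !== 'function') {
    throw new Error('No DOMParser implementation available');
  }
  const doc = new ParserCtor().parseFromString(xmlStr, 'application/xml');
  const errors = doc.getElementsByTagName('parsererror');
  if (errors.length > 0) {
    // The text content of <parsererror> is browser-specific.
    return { ok: false, error: errors[0].textContent };
  }
  return { ok: true, document: doc };
}

// Usage in a browser:
//   const result = parseXml('<name>Caleb');
//   if (!result.ok) console.error(result.error);
```

Returning a single result object lets callers handle both outcomes without a second parse of the input.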