0

Is there a way to get the first tag in an XML file and make sure it has a corresponding closing tag using the SAX parser?

Podge

4 Answers

1

Just handle endDocument(); if it is called, the document is well-formed.
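A minimal sketch of that idea, assuming the JDK's built-in JAXP parser. The names EndDocumentCheck, EndFlagHandler and reachedEnd are made up for illustration. The handler only records that endDocument() was called, and a well-formedness error surfaces as a SAXException because DefaultHandler rethrows fatal errors by default.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class EndDocumentCheck {

    /** Hypothetical handler that only records whether endDocument() was called. */
    static class EndFlagHandler extends DefaultHandler {
        boolean ended = false;

        @Override
        public void endDocument() {
            ended = true;
        }
    }

    /** Returns true if the parse completed and endDocument() was reached. */
    static boolean reachedEnd(String xml) {
        EndFlagHandler handler = new EndFlagHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // DefaultHandler.fatalError rethrows well-formedness errors
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return handler.ended;
    }

    public static void main(String[] args) {
        System.out.println(reachedEnd("<root><child/></root>")); // root is closed
        System.out.println(reachedEnd("<root><child/>"));        // root is never closed
    }
}
```

Catching the exception inside the helper keeps the caller simple; in a real handler you would probably let the SAXException propagate.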

Rocky Pulley
1

You can either handle startElement(), endElement() and endDocument(), or just handle endDocument(). If the document is not well-formed, the parser should throw a SAXParseException before endDocument() is reached. However, for the sake of learning, here are a few examples:

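import java.util.Objects;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class MyParser extends DefaultHandler {

    private String firstElement; // name of the first (root) start tag
    private String lastElement;  // name of the most recent end tag

    @Override
    public void startElement(String uri, String localName, String name, Attributes attributes) throws SAXException {
        if (firstElement == null) {
            firstElement = name;
        }
    }

    @Override
    public void endElement(String uri, String localName, String name) throws SAXException {
        lastElement = name;
    }

    @Override
    public void endDocument() {
        // Objects.equals avoids a NullPointerException if no end tag was seen
        if (Objects.equals(lastElement, firstElement)) {
            // Well formed input
        }
    }
}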

You can also ensure all elements are closed with a stack:

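import java.util.ArrayDeque;
import java.util.Deque;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class MyParser extends DefaultHandler {
    // Names of the elements that are currently open, innermost on top
    private final Deque<String> stk = new ArrayDeque<>();

    //...

    @Override
    public void startElement(String uri, String localName, String name, Attributes attributes) throws SAXException {
        stk.push(name);
    }

    @Override
    public void endElement(String uri, String localName, String name) throws SAXException {
        if (stk.pop().equals(name)) {
            // Input is well formed for each tag
        } else {
            // Not well-formed
        }
    }
}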
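
If the name of the root element is not known in advance, a depth counter is one way to pick it out: the element seen at depth 0 is the root, and the end tag that brings the depth back to 0 is its closing tag. The sketch below is only an illustration under that assumption; RootTracker and its methods are hypothetical names, not part of SAX.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class RootTracker extends DefaultHandler {
    private int depth = 0;          // number of currently open elements
    private String rootName;        // name of the element opened at depth 0
    private boolean rootClosed = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if (depth == 0) {
            rootName = qName;       // first start tag in the document
        }
        depth++;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        depth--;
        if (depth == 0) {
            rootClosed = qName.equals(rootName);
        }
    }

    String getRootName() { return rootName; }

    boolean isRootClosed() { return rootClosed; }

    /** Parses the given XML and returns the tracker, or null if the parser rejects the input. */
    static RootTracker track(String xml) {
        RootTracker tracker = new RootTracker();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), tracker);
            return tracker;
        } catch (SAXException e) {
            return null;            // not well-formed
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For example, RootTracker.track("<anything><x/></anything>") yields a tracker whose getRootName() is "anything" and whose isRootClosed() is true, without the handler knowing the name beforehand.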
Chris Dargis
  • Yeah, but I don't know the start element, as it can be one of a hundred. I need to know that the first element in the XML file opens and closes. – Podge Jul 24 '12 at 15:13
1

This sounds more like you want to use DOM parsing.

If you use SAX parsing, you are effectively saying that you do not want to process (load into memory) the entire document at once. If you search for the end of the first tag (the root tag), you are scanning the entire document anyway and lose the benefit of SAX.

The DOM parser will also throw an exception when you load a document that is not well-formed, so there is no need to manually check whether the root tag was closed.
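
A rough sketch of the DOM route, assuming the JDK's built-in JAXP DocumentBuilder. DomRootCheck and rootName are illustrative names only. The builder throws a SAXException for input that is not well-formed; otherwise the root element can be read directly from the Document.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class DomRootCheck {

    /** Returns the root element's name, or null if the input is not well-formed. */
    static String rootName(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler as error handler: fatal errors are thrown, nothing is printed
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getNodeName();
        } catch (SAXException e) {
            return null; // the parser rejected the document
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootName("<config><item/></config>"));
        System.out.println(rootName("<config><item/>"));
    }
}
```

The trade-off is the one described above: the whole tree is built in memory before the root element can be inspected.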

W. Goeman
  • I was thinking DOM would be more appropriate now. However, changing it is too much effort because I have the SAX parser doing so much as it stands. When is endDocument() run? Is it when the document actually ends, or when the closing tag of the first tag is read? – Podge Jul 24 '12 at 15:17
  • I cannot vouch for how SAX works in Java; I have only used SAX in other languages and DOM in Java. But I would assume that endDocument is fired when the root tag closes, not when you reach end-of-file. – W. Goeman Jul 24 '12 at 15:19
  • Yeah, you're right, and if the closing tag isn't present it will throw an exception. – Podge Jul 24 '12 at 15:34
0

The SAX parser actually throws an exception if any tag isn't opened or closed properly, so no extra handling is needed. If the XML file is wrong or corrupted, the SAX parser will throw the exception.
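
To illustrate, here is a small sketch that parses with a plain DefaultHandler and reports where the parser gave up. It assumes the JDK's built-in JAXP parser; WellFormedReport and firstError are hypothetical names. The line and column come from SAXParseException, and the exact message text depends on the parser implementation.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Optional;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedReport {

    /** Returns a description of the first fatal parse error, or empty if the input parsed cleanly. */
    static Optional<String> firstError(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return Optional.empty();
        } catch (SAXParseException e) {
            // Thrown by DefaultHandler.fatalError for input that is not well-formed
            return Optional.of("line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber() + ": " + e.getMessage());
        } catch (SAXException | ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstError("<a><b/></a>"));  // parses cleanly
        System.out.println(firstError("<a><b></a>"));   // <b> is never closed
    }
}
```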

bdoughan
Podge