How can I automate XML parsing using JDOM?

Question

I have to parse an XML file using JDOM and get some information from all of its elements.

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <element1>something</element1>
    <element2>
        <subelement21>moo</subelement21>
        <subelement22>
            <subelement221>toto</subelement221>
            <subelement222>tata</subelement222>
        </subelement22>
    </element2>
</root>

For element1 it's easy. But for element2 I have to go through its children, and if those children have children of their own, go through them too, and so on.

public static void getInfos(Vector<String> files) {     
    Document document = null;
    Element root = null;

    SAXBuilder sxb = new SAXBuilder();

    for (int i =0 ; i< files.size() ; i++)
    {
        System.out.println("n°" + i + " : " + files.elementAt(i));
        try
        {
            document = sxb.build(files.elementAt(i));
            root = document.getRootElement();

            List<?> listElements = root.getChildren();
            Iterator<?> it = listElements.iterator();

            while(it.hasNext())
            {
                Element courant = (Element)it.next();
                System.out.println(courant.getName());

                if(courant.getChildren().size() > 0)
                {
                    // here is the problem: this element has children of its own
                }
            }
        }
        catch (Exception e) {
            e.printStackTrace();
        }   
    }
}

What do you suggest in this case? A recursive call, or something else, so that I can reuse the same function?

Thanks.

Nathan Hughes · Answer 1 · 2011-08-16T16:35:57.340

1

I would use SAX. I'd keep a stack in the ContentHandler that tracks the current path in the document, and a buffer that the characters method appends to. In endElement I'd take the content from the buffer and clear it out, then use the current path to decide what to do with it.

(This assumes the document has no mixed content.)

Here's a link to an article on using SAX to process complex XML documents; it expands what I briefly described into an approach that handles recursive data structures. (It also has a predecessor article that is an introduction to SAX.)
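A minimal sketch of the stack-plus-buffer idea described above, using the SAX classes that ship with the JDK. The class name PathTrackingHandler, the parse helper and the "path = text" output format are illustrative assumptions, not part of the original answer, and like the answer it assumes there is no mixed content.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records "path = text" for every element with non-blank text.
class PathTrackingHandler extends DefaultHandler {
    // Element names from the root down to the element currently being parsed.
    private final Deque<String> path = new ArrayDeque<>();
    // Text collected by characters() since the most recent start or end tag.
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> results = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        path.addLast(qName);
        buffer.setLength(0); // drop whitespace seen before this tag
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one text node, so always append.
        buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String text = buffer.toString().trim();
        buffer.setLength(0);
        if (!text.isEmpty()) {
            // The current path decides what to do with the text; here it is just recorded.
            results.add("/" + String.join("/", path) + " = " + text);
        }
        path.removeLast();
    }

    // Assumed convenience wrapper: parses an XML string and returns the recorded entries.
    static List<String> parse(String xml) {
        try {
            PathTrackingHandler handler = new PathTrackingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.results;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><element1>something</element1>"
                + "<element2><subelement21>moo</subelement21></element2></root>";
        parse(xml).forEach(System.out::println);
    }
}
```

Because the path lives in a stack, the handler needs no recursion of its own: nesting of any depth is handled by the push in startElement and the pop in endElement.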

edited Aug 16 '11 at 16:35

answered Aug 16 '11 at 15:26

Nathan Hughes


What do you mean by mixed content? – Wassim AZIRAR Aug 16 '11 at 15:27
@OpenMind: mixed content is like what you see in html, where you have things like "asdfzxcvqwerty", so that has more than one element text node. – Nathan Hughes Aug 16 '11 at 15:33
I have a lot of these in my XML files, so I think your method won't work for me? – Wassim AZIRAR Aug 16 '11 at 15:34
@OpenMind: I'm not sure, it's hard to know without an example (the example in your question doesn't have any mixed content). It could be that it just needs to track more info about what's being pushed onto or popped off the stack. – Nathan Hughes Aug 16 '11 at 16:00

score 0 · Answer 2 · edited May 23 '17 at 11:55

0

You could consider using XPath to get the exact elements you want. The example here uses namespaces but the basic idea holds.
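As a rough illustration of the XPath idea, the sketch below uses the JDK's built-in javax.xml.xpath API rather than JDOM's own XPath support, so it runs without extra libraries. The class name XPathLeafText, the select helper and the expression //*[not(*)] (every element that has no child elements) are assumptions chosen for this example; no namespaces are involved.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: evaluates an XPath expression and lists "name = text" per match.
class XPathLeafText {
    static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Evaluating against an InputSource parses the XML into a DOM internally.
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(nodes.item(i).getNodeName() + " = " + nodes.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><element1>something</element1>"
                + "<element2><subelement21>moo</subelement21></element2></root>";
        // "//*[not(*)]" matches every element without child elements, at any depth.
        select(xml, "//*[not(*)]").forEach(System.out::println);
    }
}
```

The expression does the descending into nested elements, so no recursive function is needed on the Java side; a more specific path such as /root/element2/subelement21 would pick out a single element instead.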

edited May 23 '17 at 11:55

Community


answered Aug 16 '11 at 15:56

laz
