
I am using the following code to parse small XML files, and it works. But when I parse huge data files I get a stack overflow error, so I decided to convert this method to an iterative style. Writing the recursive version was straightforward, but converting it to an iterative one has left me completely lost, and I'm not getting the required output. This is my recursive code:

private void xmlParsing(Node node,int indent) throws IOException {
    if (node.hasChildNodes()) {
        Node firstChild=node.getFirstChild();
        xmlParsing(firstChild,indent+1);
    } else {
        System.out.println(node.getNodeName()+":"+node.getNodeValue()+":"+indent);
    }
    Node nextNode=node.getNextSibling();
    if (nextNode!=null) {
        xmlParsing(nextNode,indent);
    }
}

Can someone please help me convert this into an iterative function that performs the same logic in a single function? I hope my request is clear.

My full code:

package sample;

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class NewTestClass {

    private Document doc = null;

    public NewTestClass() {
        BufferedWriter br=null;
        try {
            doc = parserXML(new File("debug.xml"));

            br=new BufferedWriter(new FileWriter("xmldata.txt"));
            xmlParsing(doc, 0,br);
        } catch(Exception error) {
            error.printStackTrace();
        } finally {
            try {
                br.flush();
                br.close();
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }
    }
    private void xmlParsing(Node node,int indent,BufferedWriter br) throws IOException {
        if (node.hasChildNodes()) {
            Node firstChild=node.getFirstChild();
            xmlParsing(firstChild,indent+1,br);
        } else {

            br.write(node.getNodeName()+":"+node.getNodeValue()+":"+indent);
            br.newLine();
        }
        Node nextNode=node.getNextSibling();
        if (nextNode!=null) {
            xmlParsing(nextNode,indent,br);
        }
    }

    public Document parserXML(File file) throws SAXException, IOException, ParserConfigurationException
    {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(file);
    }

    public static void main(String[] args)
    {
        new NewTestClass();
    }
}

My initial error:

Exception in thread "main" java.lang.StackOverflowError
    at com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl.getNodeValueString(Unknown Source)
    at com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl.getNodeValueString(Unknown Source)
    at com.sun.org.apache.xerces.internal.dom.DeferredTextImpl.synchronizeData(Unknown Source)
    at com.sun.org.apache.xerces.internal.dom.CharacterDataImpl.getNodeValue(Unknown Source)
bobwah
user1119970
    You need a collection to store all the nested state. – Peter Lawrey Dec 29 '11 at 08:43
    @PeterLawrey: Not necessarily. I'm pretty sure, the `StackOverflowError` occurs because of the recursion for siblings, which can be easily transformed into an iteration... – Lukas Eder Dec 29 '11 at 09:27

2 Answers


Your problem is that you recurse for siblings as well as for children. Recursing into children is perfectly OK, because that depth only follows the nesting of the document. With sibling recursion, however, the call depth can grow as deep as the number of (flattened) nodes (not only elements) in your document.

Do this instead:

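private void xmlParsing(Node node, int indent) throws IOException {

    // iterate over siblings instead of recursing for them
    while (node != null) {

        // recurse for children only; depth is bounded by the nesting level
        if (node.hasChildNodes()) {
            Node firstChild = node.getFirstChild();
            xmlParsing(firstChild, indent + 1);
        } else {
            // leaf node action, taken from the code in the question
            System.out.println(node.getNodeName() + ":" + node.getNodeValue() + ":" + indent);
        }

        node = node.getNextSibling();
    }
}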
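
The loop above still recurses for children, which is normally fine. If the recursion should disappear entirely, as the question asks, one option is to keep the pending nodes in an explicit stack (the "collection to store all the nested state" mentioned in the comments). The following is only a sketch under that assumption: the names `IterativeXmlWalk`, `Frame`, `walk` and `dump` are hypothetical, and the output goes to an `Appendable` with `'\n'` line endings rather than through `BufferedWriter.newLine()`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical class name; not part of the original question or answer.
class IterativeXmlWalk {

    // A pending node together with the indent it should be reported at.
    private static final class Frame {
        final Node node;
        final int indent;
        Frame(Node node, int indent) { this.node = node; this.indent = indent; }
    }

    // Visits nodes in the same order as the recursive xmlParsing, without recursion.
    static void walk(Node start, int indent, Appendable out) throws IOException {
        Deque<Frame> stack = new ArrayDeque<>();
        stack.push(new Frame(start, indent));
        while (!stack.isEmpty()) {
            Frame f = stack.pop();
            // Push the sibling first so that it is handled after the whole subtree.
            Node next = f.node.getNextSibling();
            if (next != null) {
                stack.push(new Frame(next, f.indent));
            }
            if (f.node.hasChildNodes()) {
                // The first child ends up on top of the stack and is visited next.
                stack.push(new Frame(f.node.getFirstChild(), f.indent + 1));
            } else {
                // Leaf node action from the question.
                out.append(f.node.getNodeName() + ":" + f.node.getNodeValue() + ":" + f.indent);
                out.append('\n');
            }
        }
    }

    // Demonstration helper (an assumption): parses an XML string and returns the dump.
    static String dump(String xml) throws Exception {
        Node doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        StringBuilder sb = new StringBuilder();
        walk(doc, 0, sb);
        return sb.toString();
    }
}
```

Because a `BufferedWriter` is also an `Appendable`, a call such as `IterativeXmlWalk.walk(doc, 0, br)` should fit into the constructor from the question. The stack holds at most one pending sibling per nesting level, so its size follows the depth of the document rather than the total number of nodes.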
Lukas Eder

I think you have a huge level of nesting of tags. Can you post your sample XML file?

If I understand correctly, you are trying to transform an XML file into a text file with a certain format. If that is the requirement, I would suggest using XSL with XML for the translation. It is easy and flexible.

You can find an example at http://speakingjava.blogspot.com/2011/07/how-to-use-xml-and-xsl-in-java.html
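
To give a rough idea of what the XSL route could look like for this particular output format, here is a minimal sketch using the standard `javax.xml.transform` API. The stylesheet and the names `XslDump`, `XSL` and `dump` are assumptions made for illustration, not taken from the linked article. It only approximates the DOM code from the question: comments and processing instructions are not printed, and adjacent text and CDATA sections count as a single text node in the XPath data model.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical class name; a sketch, not the method from the linked article.
class XslDump {

    // Prints "name:value:depth" for text nodes and for elements without children.
    // count(ancestor::node()) includes the document root, which mirrors the
    // indent values produced by the DOM code when it starts at the Document.
    static final String XSL =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='text()'>"
        + "<xsl:value-of select=\"concat('#text:', ., ':', count(ancestor::node()))\"/>"
        + "<xsl:text>&#10;</xsl:text>"
        + "</xsl:template>"
        + "<xsl:template match='*[not(node())]'>"
        + "<xsl:value-of select=\"concat(name(), ':null:', count(ancestor::node()))\"/>"
        + "<xsl:text>&#10;</xsl:text>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to an XML string and returns the text result.
    static String dump(String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }
}
```

For files, `new StreamSource(new File("debug.xml"))` and `new StreamResult(new File("xmldata.txt"))` could take the place of the string-based source and result. Line endings in the output may depend on the platform and on the XSLT processor in use.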

Pragalathan M
  • In this case, moving to XSL might be overkill, as there is just a simple misconception in the algorithm: the recursion for siblings (instead of iteration over siblings) – Lukas Eder Dec 29 '11 at 09:24
  • @Pragalathan, Lukas is right. I need to work on simple things. XSL is new to me, and I don't think it is necessary here. Anyway, thanks for your suggestion. – user1119970 Dec 29 '11 at 09:35