I'm developing a program that parses any XML file and returns its data in the exact order it appears. The problem is that the program must be able to handle different XML files (different elements, different structure, etc.). I've been able to do this with DOM, since I can write a recursive method that walks the tree and returns the values of the nodes, but that isn't efficient given the size of my XML files. So I'm interested in using a SAX parser instead. However, as I'm sure you're aware, it is not as easy to use, because it doesn't build a data structure to hold the information. It may seem like a silly question (I only recently learned programming), but does anyone have any examples or suggestions for how I can overcome this?

I'm also having a look at StAX, which seems to be the better way to go!

Thanks

user2062207
  • When you are trying to parse a large file, SAX is not a good parser to use; look for some other parsers – thar45 Oct 28 '14 at 15:12
  • Yeah, I'm currently looking at StAX, but I just don't know how I'm going to be able to recall all elements and their values in order without a predefined data structure – user2062207 Oct 28 '14 at 15:13
  • 3
    SAX provides you with data from an XML file on the fly and expects you to do something with it. It just parses a file and issues a callback for any interesting syntax objects it finds, such as start tags, end tags or character data. Since you don't seem to care about structure, this should be perfect for your use case. Your DOM recursion is already implemented in a SAX parser; you just need to react when you find something of interest. – predi Oct 28 '14 at 15:13
  • 3
    StAX is a pull parser, where you control how the parse proceeds. For example, you can say: "Oh, this XML element is useless, simply skip it entirely", while with a SAX parser you cannot choose this flow. A SAX parser IS suited for large files. – predi Oct 28 '14 at 15:17
  • Hmm, what if I wanted to create a program in which you can specify what information you want to retrieve from an XML file? At first I want it to list all the elements that exist, and then you specify what information you want to retrieve. Would that be best done in SAX or StAX? – user2062207 Oct 28 '14 at 16:00
  • 2
    Very hard to give advice since you haven't given any clues as to what your program actually does, and that does rather affect things. If you're not an experienced programmer then StAX is probably going to be easier to use than SAX because it puts you in control and programmers always like to be in control. – Michael Kay Oct 28 '14 at 16:37
  • To be honest, I don't know exactly what I want the program to do either just yet, but StAX does seem like the better option. Thank you – user2062207 Oct 28 '14 at 17:23
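
To make the SAX callback model and the StAX pull model from the comments above concrete, here is a minimal sketch in Java using only the standard `javax.xml` APIs. It is not anyone's posted solution: the class name `OrderedXmlDump`, the methods `withSax` and `withStax`, and the `start:` / `text:` / `end:` labels are made up for illustration. Both methods record element names and non-blank text in document order without knowing the XML structure in advance, and they assume un-prefixed element names (no namespaces).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
public class OrderedXmlDump {

    // SAX (push): the parser calls us back for each event; we append to a list.
    static List<String> withSax(String xml) throws Exception {
        List<String> out = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            // characters() may be called several times for one text node,
            // so text is buffered and flushed at the next tag.
            private final StringBuilder text = new StringBuilder();

            private void flushText() {
                String t = text.toString().trim();
                if (!t.isEmpty()) out.add("text:" + t);
                text.setLength(0);
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                flushText();
                out.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                flushText();
                out.add("end:" + qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return out;
    }

    // StAX (pull): we ask the reader for the next event and decide what to do.
    static List<String> withStax(String xml) throws Exception {
        List<String> out = new ArrayList<>();
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Merge adjacent character events into one.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        while (reader.hasNext()) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    out.add("start:" + reader.getLocalName());
                    break;
                case XMLStreamConstants.CHARACTERS:
                    String t = reader.getText().trim();
                    if (!t.isEmpty()) out.add("text:" + t);
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    out.add("end:" + reader.getLocalName());
                    break;
                default:
                    break; // comments, processing instructions, etc. are ignored
            }
        }
        reader.close();
        return out;
    }
}
```

Calling either method on `<root><a>1</a><b><c>x</c></b></root>` should give the same ordered list of events. For very large files you would probably write each event straight to its destination instead of collecting a list, and read from an `InputStream` rather than a `String`.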

0 Answers