Questions tagged [sax]

SAX stands for Simple API for XML, and is an event-based way of reading XML data from a document.

SAX (Simple API for XML) is an event-based sequential access parser API developed by the XML-DEV mailing list for XML documents.
SAX provides a mechanism for reading data from an XML document that is an alternative to that provided by the Document Object Model (DOM). Where the DOM operates on the document as a whole, SAX parsers operate on each piece of the XML document sequentially.

XML processing with SAX

A parser that implements SAX (i.e., a SAX parser) functions as a stream parser with an event-driven API. The user defines a number of callback methods that are called when events occur during parsing. SAX events include, among others, the start and end of each element, runs of character data, and processing instructions.
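The callback style described above can be sketched with Python's standard `xml.sax` module. This is only an illustration, not part of the tag description: the `EventRecorder` class, the `record_events` helper, and the sample document are hypothetical names and data invented for the example.

```python
import xml.sax


class EventRecorder(xml.sax.ContentHandler):
    """Hypothetical handler that records SAX events in the order they arrive."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Called when an opening tag is read; attrs holds its attributes.
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        # Called for text; a parser may deliver one text run in several calls.
        # Whitespace-only chunks are skipped to keep the record short.
        if content.strip():
            self.events.append(("text", content.strip()))

    def endElement(self, name):
        # Called when the matching closing tag is read.
        self.events.append(("end", name))


def record_events(xml_text):
    """Parse xml_text sequentially and return the list of recorded events."""
    handler = EventRecorder()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


# Invented sample document, used only for this sketch.
SAMPLE = '<catalog><book id="bk101"><author>Gambardella, Matthew</author></book></catalog>'
events = record_events(SAMPLE)
for event in events:
    print(event)
```

The handler never sees the document as a whole. It only receives one event at a time, which is why any state needed later (here, the `events` list) has to be kept by the handler itself.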

1784 questions
8
votes
4 answers

How to combine large XML files using MSXML SAX in Delphi

Edit: My (incomplete and very rough) XmlLite header translation is available on GitHub. What is the best way to do a simple combine of massive XML documents in Delphi with MSXML without using DOM? Should I use the COM components SAXReader and…
carlmon
  • 396
  • 6
  • 20
8
votes
4 answers

SAX parser: Ignoring special characters

I'm using Xerces to parse my XML document. The issue is that XML escaped characters like   appear in characters() method as non-escaped ones. I need to get escaped characters inside characters() method as is. Thanks. UPD: Tried to override…
Alexander Oleynikov
  • 19,190
  • 11
  • 37
  • 51
8
votes
1 answer

Java (JAXP) XML parsing differences of DocumentBuilder

Is there any difference between DocumentBuilder.parse(InputStream) and DocumentBuilder.parse(InputSource)? I could only find that in the first case, the parser detects the encoding from the stream so it is safer, while in the latter…
Cratylus
  • 52,998
  • 69
  • 209
  • 339
8
votes
2 answers

Is there a way to parse XML via SAX/DOM with line numbers available per node

I have already written a DOM parser for a large XML document format that contains a number of items that can be used to automatically generate Java code. This is limited to small expressions that are then merged into a dynamically generated Java…
Chris
  • 4,450
  • 3
  • 38
  • 49
8
votes
3 answers

The markup must be well-formed

First off, let me say I am new to SAX and Java. I am trying to read information from an XML file that is not well formed. When I try to use the SAX or DOM parser, I get the following error in response: The markup in the document following the root…
Haythem
  • 417
  • 4
  • 13
  • 20
8
votes
1 answer

How to get element's value from XML using SAX parser in startElement?

Is it possible to get the content of an element from an XML file in the startElement function, the overridden function of the SAX handler? Below is the specification. 1) XML file
sakura
  • 199
  • 1
  • 2
  • 21
8
votes
2 answers

SaxParseException in eclipse: XML document structures must start and end within the same entity

I am using the last.fm API for Java, which can be found here. I have a huge dataset in which I am only using the file with user's artist history and plays. I have written code in Java which extracts these artist names and returns the similar…
HackCode
  • 1,837
  • 6
  • 35
  • 66
8
votes
2 answers

How to handle namespaces with SAX Parser?

I'm trying to learn to parse XML documents. I have an XML document that uses namespaces, so I'm sure I need to do something to parse it correctly. This is what I have: DefaultHandler handler = new DefaultHandler() { boolean bfname =…
Rafael Carrillo
  • 2,772
  • 9
  • 43
  • 64
8
votes
2 answers

SAX parsing - efficient way to get text nodes

Given this XML snippet Gambardella, Matthew In SAX, it is easy to get attribute values: @Override public void startElement (String uri, String localName, …
Eran Medan
  • 44,555
  • 61
  • 184
  • 276
8
votes
9 answers

Better way to parse xml

I've been parsing XML like this for years, and I have to admit that when the number of different elements becomes larger, I find it a bit boring and exhausting to do. Here is what I mean, sample dummy XML:
Gandalf StormCrow
  • 25,788
  • 70
  • 174
  • 263
8
votes
3 answers

Best practice: Creation of SAX parser for XMLReader

I'm using the Amazon S3 SDK in two separate wars running on the same Tomcat. I initialize an AmazonS3Client in the @PostConstruct of one of my Spring services. If I run these wars separately, everything usually works fine. If I run them together,…
Eyal
  • 3,412
  • 1
  • 44
  • 60
8
votes
2 answers

Difference SAXParserFactory XMLReaderFactory. Which one to choose?

Both of them seem to have the same purpose (creating an XMLReader). Some tutorials use one, some the other. SAXParserFactory: http://docs.oracle.com/javase/7/docs/api/javax/xml/parsers/SAXParserFactory.html seems to be more configurable more…
juwens
  • 3,729
  • 4
  • 31
  • 39
8
votes
9 answers

A lightweight XML parser efficient for large files?

I need to parse potentially huge XML files, so I guess this rules out DOM parsers. Is there any good lightweight SAX parser for C++ out there, comparable with TinyXML in footprint? The structure of the XML is very simple, no advanced things like namespaces…
Alex Jenter
  • 4,324
  • 4
  • 36
  • 61
7
votes
4 answers

How to return data from a Python SAX parser?

I've been trying to parse some huge XML files that LXML won't grok, so I'm forced to parse them with xml.sax. class SpamExtractor(sax.ContentHandler): def startElement(self, name, attrs): if name == "spam": print("We found a…
Fred Foo
  • 355,277
  • 75
  • 744
  • 836
7
votes
5 answers

Using Python's xml.etree to find element start and end character offsets

I have XML data that looks like: The capital of South Africa is Pretoria. I would like to be able to extract: The XML elements as they're currently provided in etree. The full plain text of the…
Leon Derczynski
  • 542
  • 5
  • 15