For my homework I have to parse some SGML files. I am using SAXParser. It works correctly for a simple XML file, but when I try to parse the homework SGML files, this error occurs:

Exception in thread "main" org.xml.sax.SAXParseException; systemId: file:///C:/Users/MarkaZ%20Computer%20RooZ/Documents/workspace/HW_02_IR/lewis.dtd; lineNumber: 2; columnNumber: 17; A '(' character or an element type is required in the declaration of element type "LEWIS".

I don't know anything about DTD documents. My code is:

    SAXParserFactory parserFactor = SAXParserFactory.newInstance();
    SAXParser parser = parserFactor.newSAXParser();
    SAXHandler handler = new SAXHandler();

    parser.parse(new FileInputStream("reut2-000.sgm"), handler);

How can I prevent this error?

Please excuse my bad English.

Hamidreza Samadi

3 Answers

If you want to parse XML, use an XML parser. If you want to parse SGML, use an SGML parser (for example, James Clark's SP). Trying to parse SGML using an XML parser is like trying to compile Java with a C# compiler - it won't work.
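
To make the mismatch concrete, here is a minimal Java sketch. It assumes, hypothetically, that `lewis.dtd` declares elements with SGML tag-omission indicators (`- -`), which XML DTD syntax does not allow. The actual contents of `lewis.dtd` are not shown in the question, so both declarations below are invented, as are the class `DtdSyntaxSketch` and its `tryParse` helper.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class DtdSyntaxSketch {

    // Embeds the given element declaration in an internal DTD subset and
    // reports whether an XML parser accepts the resulting document.
    static String tryParse(String elementDecl) throws Exception {
        String doc = "<!DOCTYPE LEWIS [\n"
                + elementDecl + "\n"
                + "<!ELEMENT REUTERS (#PCDATA)>\n"
                + "]>\n"
                + "<LEWIS><REUTERS>text</REUTERS></LEWIS>";
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(doc)), new DefaultHandler());
            return "parsed";
        } catch (SAXParseException e) {
            return "rejected: " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical SGML-style declaration: "- -" is not XML DTD syntax.
        System.out.println(tryParse("<!ELEMENT LEWIS - - (REUTERS+)>"));
        // The XML-style equivalent, without the tag-omission indicators.
        System.out.println(tryParse("<!ELEMENT LEWIS (REUTERS+)>"));
    }
}
```

With the JDK's built-in parser, the first call should be rejected and the second should parse. The exact wording of the rejection message depends on the parser implementation.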

Michael Kay

Your XML or DTD is malformed; see the end of the error message:

... A '(' character or an element type is required in the declaration of element type "LEWIS".

The error mentions lewis.dtd - perhaps that is where the fault is.

You have some options:

  1. Fix your DTD.
  2. Manually edit the XML file so it becomes well-formed.
  3. Filter the XML file before passing it to the parser, editing it on the fly to make it well-formed.
  4. Use a different parser that is tolerant of malformed XML.
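
Option 3 could look roughly like the sketch below. It assumes the file differs from XML in only three ways: a `DOCTYPE` line pointing at the SGML DTD, no single root element, and numeric character references to control characters such as `&#3;`. None of this is confirmed by the question, so check your own files. `SgmlFilterSketch`, `toWellFormedXml`, `elementNames` and the `ROOT` wrapper are made-up names, and the sample input is invented.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SgmlFilterSketch {

    // Rewrites the input so an XML parser can read it, under the assumptions
    // stated above. The whole input is held in memory as one string.
    static String toWellFormedXml(String sgml) {
        String body = sgml
                // Drop the DOCTYPE so the parser never loads the SGML DTD.
                .replaceAll("(?i)<!DOCTYPE[^>]*>", "")
                // Drop decimal references to control characters that XML 1.0
                // forbids (0-8, 11, 12, 14-31); tab, LF and CR are kept.
                .replaceAll("&#(?:[0-8]|1[124-9]|2[0-9]|3[01]);", "");
        // Wrap everything in a single synthetic root element.
        return "<ROOT>" + body + "</ROOT>";
    }

    // Parses the XML with SAX and collects element names in document order.
    static List<String> elementNames(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add(qName);
            }
        });
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Invented sample that mimics the assumed shape of the input.
        String sample = "<!DOCTYPE lewis SYSTEM \"lewis.dtd\">\n"
                + "<REUTERS NEWID=\"1\"><BODY>first&#3;</BODY></REUTERS>\n"
                + "<REUTERS NEWID=\"2\"><BODY>second</BODY></REUTERS>\n";
        System.out.println(elementNames(toWellFormedXml(sample)));
    }
}
```

In your own code you would pass your `SAXHandler` instead of the anonymous handler. Regex filtering like this is fragile: if the files rely on other SGML features, such as omitted end tags, it will not be enough and a real SGML parser is the safer choice.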
OldCurmudgeon
  • Thank you. I think the problem is in `lewis.dtd`, but I don't know anything about DTD files. – Hamidreza Samadi Apr 17 '15 at 10:06
  • @HamidrezaSamadi - Surely you can find `lineNumber: 2; columnNumber: 17;` in the dtd and look at the declaration of type `LEWIS`? There are [many](http://www.w3schools.com/dtd/dtd_examples.asp) examples out there. – OldCurmudgeon Apr 17 '15 at 10:27

You can use a tool such as XMLSpy to validate your SGML file against the given XSD or DTD. Any errors are highlighted in red, and you can then correct them manually.

Once it is corrected, you can go on to parse it with SAX.

prashant thakre