
I'm looking for an open-source SGML parser written in plain C. This is to parse bona-fide SGML, not malformed stuff.

Any ideas?

warren
Adam Ernst

2 Answers


There's OpenSP, which is part of the OpenJade project, though it is implemented in C++ rather than C. Might that be close enough for your needs?
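If the C++ dependency is the sticking point, one possible workaround is to keep your own code in plain C and consume the parser's text output instead of linking against it. The sketch below assumes OpenSP's command-line parser is installed as `onsgmls` and that it emits the line-oriented ESIS format inherited from nsgmls, where the first character of each line is a command (`(` start element, `)` end element, `-` data, `A` attribute, `C` conforming document). The names `esis_kind`, `esis_classify`, and `esis_print_outline` are made up for illustration; they are not part of OpenSP.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Record kinds this sketch distinguishes; everything else is ESIS_OTHER. */
typedef enum {
    ESIS_START,      /* "(GI"      start of element GI             */
    ESIS_END,        /* ")GI"      end of element GI               */
    ESIS_DATA,       /* "-text"    character data                  */
    ESIS_ATTR,       /* "Aname ..." attribute of the next element  */
    ESIS_CONFORMING, /* "C"        document parsed without errors  */
    ESIS_OTHER
} esis_kind;

/* Classify one ESIS line (without its trailing newline).
 * *payload is set to the text after the command character. */
esis_kind esis_classify(const char *line, const char **payload)
{
    assert(line != NULL && payload != NULL);
    *payload = (line[0] != '\0') ? line + 1 : line;
    switch (line[0]) {
    case '(': return ESIS_START;
    case ')': return ESIS_END;
    case '-': return ESIS_DATA;
    case 'A': return ESIS_ATTR;
    case 'C': return ESIS_CONFORMING;
    default:  return ESIS_OTHER;
    }
}

/* Read ESIS records from `in` and write an indented element outline to `out`. */
void esis_print_outline(FILE *in, FILE *out)
{
    char line[4096];
    int depth = 0;
    int at_line_start = 1; /* false while inside an over-long line */

    while (fgets(line, sizeof line, in) != NULL) {
        size_t n = strlen(line);
        int complete = (n > 0 && line[n - 1] == '\n');

        if (at_line_start) {
            const char *payload;
            line[strcspn(line, "\n")] = '\0';
            switch (esis_classify(line, &payload)) {
            case ESIS_START:
                fprintf(out, "%*s%s\n", depth * 2, "", payload);
                depth++;
                break;
            case ESIS_END:
                if (depth > 0)
                    depth--;
                break;
            default:
                break; /* data, attributes, etc. are ignored here */
            }
        }
        /* Continuation chunks of a long line are skipped, not reclassified. */
        at_line_start = complete;
    }
}
```

A driver would only need to call `esis_print_outline(stdin, stdout)` from `main` and be fed by a pipe, for example `onsgmls doc.sgml | ./outline` (file and program names are placeholders). This keeps the C side trivial at the cost of running a separate process, so it may not suit cases where the parser has to be embedded.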

Alastair
  • SGML is complicated stuff and there are few libraries. OpenSP is a good choice. – bortzmeyer Nov 17 '08 at 18:31
  • True. Writing a full SGML parser is not for mere mortals. There's plenty of goofy stuff in ISO 8879 (the SGML standard); it's a tribute to James Clark's skill that he got so much of it right in SP (now OpenSP). – arayq2 Dec 31 '12 at 21:42

This came up on a quick Google search for "sgml c parser": http://www.w3.org/Library/src/SGML.html. Does that help?

Or perhaps this one: http://www.math.utah.edu/pub/sgml/sgmls/

warren
  • I was hoping for someone with direct experience but I suppose SGML isn't exactly that common anymore :-) Thanks, I think the first one sounds like the best bet. – Adam Ernst Nov 16 '08 at 19:40
  • The "SGML parser" at W3C isn't really one. It was developed for the original WWW library (don't ask - long obsolete), and only handles a severely restricted subset of SGML syntax - basically, all that was thought to be "needed" for parsing HTML in the early days. – arayq2 Dec 31 '12 at 21:34
  • SGMLS is a true SGML parser developed by James Clark from something called the ARC-SGML Parsing Materials. He subsequently (1994) rewrote it from scratch in C++ as nsgmls (now an [open source project](http://openjade.sourceforge.net/doc/nsgmls.htm)). There are [other SGML parsers](http://xml.coverpages.org/publicSW.html) in C, such as YASP and YAO, but they're not easily found. – arayq2 Dec 31 '12 at 21:35