I'm looking into extracting all the content from a DTD using Perl, but I'm not sure of the best way to go about it. I know there are modules for working with XML, but I don't know whether any exist for this kind of work with SGML, or whether I should try to write a regular expression instead.

I'm new to both SGML and Perl, and I have little experience with regular expressions beyond very simple pattern matching.

James Drinkard

1 Answer

You have two options here:

  • Use the old perlSGML distribution, which I have used in the (remote!) past. Since it is Perl, it should still run on a modern perl.

  • Convert your SGML to XML using osx, which is part of OpenSP and is available for at least Debian/Ubuntu (the package is called opensp) and most likely other platforms, then use XML tools like XML::LibXML or XML::Twig.

There are a lot more XML tools than SGML tools these days, but of course you may lose some information, since DTDs are slightly simpler in XML than in SGML.
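As for the regular-expression route raised in the question: for a small, simple DTD it can work as a first pass, though it is not a substitute for a real SGML parser. The following is only a rough sketch in core Perl. The helper name extract_declarations and the sample DTD are made up for illustration, and the pattern assumes declarations with no parameter-entity expansion, no marked sections, and no SGML "-- ... --" comments inside a declaration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of perlSGML or any CPAN module):
# returns one hashref per <!ELEMENT>, <!ATTLIST> or <!ENTITY> declaration.
sub extract_declarations {
    my ($dtd_text) = @_;

    # Remove <!-- ... --> comments so commented-out declarations are skipped.
    (my $clean = $dtd_text) =~ s/<!--.*?-->//gs;

    my @decls;
    # Keyword, then a body that may contain quoted literals (which may
    # themselves contain '>'), up to the closing '>'.
    while ($clean =~ /<!(ELEMENT|ATTLIST|ENTITY)\s+((?:"[^"]*"|'[^']*'|[^>"'])*)>/gi) {
        my ($kind, $body) = (uc($1), $2);
        $body =~ s/^\s+|\s+$//g;                    # trim both ends
        $body =~ s/^%\s+/%/ if $kind eq 'ENTITY';   # keep '%' on parameter entity names
        my ($name, $rest) = split ' ', $body, 2;    # first token is the declared name
        next unless defined $name;
        push @decls, { kind => $kind, name => $name, rest => $rest // '' };
    }
    return @decls;
}

# Sample input, invented for this example.
my $dtd = <<'DTD';
<!-- sample declarations, made up for illustration -->
<!ENTITY % inline "#PCDATA | em">
<!ELEMENT memo - - (to, body)>
<!ELEMENT to   - O (#PCDATA)>
<!ATTLIST memo status (draft|final) "draft">
<!ENTITY gt ">">
DTD

# Print kind, name and the remainder of each declaration.
printf "%-8s %-8s %s\n", @{$_}{qw(kind name rest)}
    for extract_declarations($dtd);
```

Anything beyond this (name groups with embedded spaces, entity references that must be resolved, included declaration subsets) is where the two options above become the safer choice.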

Bill Ruppert
mirod