I have some documents that are structured using SGML, and I have a DTD file which describes this structure.

Can someone recommend a Python 3 compatible library or module for parsing this data? For Python 2.x, my Google-fu turns up SGMLParser, but that is of course deprecated (and removed outright from Py3k).

Many seem to suggest lxml, but that is not an option for me due to dependency issues.

I know BeautifulSoup is great for messy markup, but A) last I heard it wasn't py3k compatible, and B) this content is well-structured.

Adam Parkin

1 Answer

BeautifulSoup 3 is deprecated. Use its replacement, Beautiful Soup 4 (imported as bs4), which is Py3k compatible.
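If installing any third-party package is a problem (the question rules out lxml for dependency reasons), a rough stdlib-only alternative is html.parser.HTMLParser, which offers the same event-driven style that SGMLParser had. The sketch below is only an illustration under some assumptions: the class name OutlineParser, the helper parse_outline, and the sample markup are made up for this example, the documents are assumed to close every tag explicitly, and nothing here reads or validates against the DTD. Note that HTMLParser lowercases tag names.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collect (tag path, text) pairs from simple, explicitly closed markup.

    Hypothetical helper for illustration; it does not consult a DTD.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.path = []      # stack of currently open tags
        self.records = []   # (path, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        self.path.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray end tags.
        if tag in self.path:
            while self.path.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.records.append(("/".join(self.path), text))


def parse_outline(markup):
    """Parse a markup string and return the collected (path, text) records."""
    parser = OutlineParser()
    parser.feed(markup)
    parser.close()
    return parser.records


# Made-up sample document, not taken from the question.
sample = "<doc><title>Report</title><para>First <emph>point</emph>.</para></doc>"
records = parse_outline(sample)
for path, text in records:
    print(path, "->", text)
```

SGML features such as omitted end tags or short tags are not handled by this approach, so it only fits documents that already look like well-formed tagged text.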

Paul Sweatte