RFCs (http://www.ietf.org/rfc.html) are usually published as text files.

  • Are there any other formats, which would make parsing the RFC content easier?
  • Are there any parsers for the widely used RFC text documents?
miku
  • A good format would be XML. RFC 2629 (http://xml.resource.org/public/rfc/html/rfc2629.html) already specifies such a format. Unfortunately, the published RFCs are not in XML. I started something that tries to parse the text files into RFC 2629 XML, but it is really tedious... – jdehaan Aug 05 '10 at 11:51
  • There is a newer draft: http://xml.resource.org/authoring/draft-mrose-writing-rfcs.html – jdehaan Aug 05 '10 at 11:54

2 Answers

A limited number of RFCs are offered as XML at http://xml.resource.org/public/rfc/xml/

You could also merge the text data with the BibXML entries from http://xml.resource.org/public/rfc/bibxml/
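As a rough illustration of what reading such a bibliographic entry could look like, here is a minimal Python sketch. It assumes the entry follows the `<reference>` element shape described in RFC 2629; the sample below is hand-written for illustration rather than fetched from the site, and `parse_reference` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the RFC 2629 <reference> shape (an assumption about
# what the entries look like; it was not downloaded from the URL above).
SAMPLE_REFERENCE = """\
<reference anchor="RFC2629">
  <front>
    <title>Writing I-Ds and RFCs using XML</title>
    <author initials="M." surname="Rose" fullname="M. Rose"/>
    <date year="1999" month="June"/>
  </front>
  <seriesInfo name="RFC" value="2629"/>
</reference>
"""


def parse_reference(xml_text):
    """Return the anchor, title, authors, year and series info of one entry."""
    root = ET.fromstring(xml_text)
    front = root.find("front")
    if front is None:
        raise ValueError("no <front> element found")
    date = front.find("date")
    return {
        "anchor": root.get("anchor"),
        "title": (front.findtext("title") or "").strip(),
        # Prefer the full name, fall back to the surname if it is missing.
        "authors": [
            a.get("fullname") or a.get("surname") or ""
            for a in front.findall("author")
        ],
        "year": date.get("year") if date is not None else None,
        "series": {s.get("name"): s.get("value") for s in root.findall("seriesInfo")},
    }


if __name__ == "__main__":
    print(parse_reference(SAMPLE_REFERENCE))
```

The resulting dictionary could then be joined with the plain-text RFC by its number (the `RFC` key in `series`), which is one way to read "merge" here.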

gliptak
  • Using this same resource, you can find an HTML format as well: http://xml.resource.org/public/rfc/html/rfc2629.html. Note that this is properly formatted HTML (in my opinion) compared to the IETF HTML version. – styfle Dec 05 '13 at 21:28
  • ^^ although this is not an exhaustive list – Duncan Jones Jan 17 '18 at 08:29

IETF maintains minimally marked-up RFCs in HTML, for example:

https://www.rfc-editor.org/rfc/rfc2616.html

but the markup consists mostly of anchors that implement a table of contents, plus a main body that is mostly <pre> ... </pre>. Nevertheless, it might be possible to do some meaningful parsing on those RFCs.
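A minimal Python sketch of that kind of parsing, using only the standard library, might look like the following. It assumes the page really is a series of `<pre>` blocks with `<a name="...">` (or `id`) anchors inside them; the sample HTML is hand-written to match that description and is not a copy of the real page, and `RfcHtmlExtractor` and `extract` are hypothetical names.

```python
from html.parser import HTMLParser


class RfcHtmlExtractor(HTMLParser):
    """Collect anchor names/ids and the text found inside <pre> blocks."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.anchors = []     # values of name= or id= on <a> tags
        self.pre_blocks = []  # plain text of each top-level <pre> block
        self._pre_depth = 0
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            target = attrs.get("name") or attrs.get("id")
            if target:
                self.anchors.append(target)
        elif tag == "pre":
            self._pre_depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self._pre_depth > 0:
            self._pre_depth -= 1
            if self._pre_depth == 0:
                # Closing the outermost <pre>: store the collected text.
                self.pre_blocks.append("".join(self._buffer))
                self._buffer = []

    def handle_data(self, data):
        if self._pre_depth > 0:
            self._buffer.append(data)


def extract(html_text):
    """Return (anchors, pre_blocks) for one HTML document."""
    parser = RfcHtmlExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.anchors, parser.pre_blocks


# Hand-written sample; the real markup may differ in detail.
SAMPLE_HTML = """\
<html><body>
<pre>
<a name="section-1">1</a>. Introduction

   Example body text &amp; more.
</pre>
<pre>
<a name="section-2">2</a>. Notational Conventions
</pre>
</body></html>
"""

if __name__ == "__main__":
    anchors, blocks = extract(SAMPLE_HTML)
    print(anchors)
    print(len(blocks))
```

Since the anchors mostly mark section starts, they could serve as split points for the otherwise unstructured `<pre>` text.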

W3C has some HTMLized RFCs, for example:

http://www.w3.org/Protocols/rfc2616/rfc2616.html

in which the markup is somewhat richer in its semantics and so perhaps more amenable to parsing.

Pete Wilson