13

I've been looking around and am quite surprised that there seems to be no way to parse headers generically in libcurl (which seems to be the canonical C library for HTTP these days).

The closest thing I've found was a mailing list post where someone suggested someone else search through the mailing list archives.

The only facility libcurl provides via setopt is CURLOPT_HEADERFUNCTION, which feeds the response headers to a callback one line at a time.

This seems entirely too primitive, considering that headers can span multiple lines. Ideally this should be done once, correctly (preferably by the library itself), and not left for application developers to continually reinvent.

Edit:

For an example of the naïve approach not working, see the following gist, which contains a libcurl code example and a properly formed HTTP response that it can't parse: https://gist.github.com/762954

Dustin
  • I'm right with you. The unfortunate thing is that libcurl also seems to do some processing of header lines even if no `CURLOPT_HEADERFUNCTION` is provided and `CURLOPT_HEADER` is set to true. i.e. it's doing a lot of lexing that's fairly useless. – jberryman May 25 '15 at 10:06

2 Answers

14

Been over a year, so I think I'll close this as "manually." Or:

If you're having cURL problems, I feel bad for you son,

You've got multi-line headers and must parse each one.

Dustin
10

libcurl reads each HTTP header and sends it as a single complete line to the header callback.

"Continued" HTTP header lines are not allowed in the HTTP 1.1 RFC 7230 family, and they were virtually extinct even before that, but when a server does send them, they are passed to the callback as well.

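If you still need to cope with continued lines in a header callback, one possible approach is to fold them into the previous line yourself. The sketch below assumes a continuation line is any line starting with a space or tab, and that each callback invocation carries exactly one line. The names `header_list` and `fold_header_cb` are hypothetical and not part of libcurl; only the callback signature is the one CURLOPT_HEADERFUNCTION expects.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for unfolded header lines; not part of libcurl. */
struct header_list {
    char **lines;   /* each entry is one NUL-terminated, unfolded line */
    size_t count;
};

/* Length of the line once trailing CR/LF characters are dropped. */
static size_t trimmed_len(const char *buf, size_t len)
{
    while (len > 0 && (buf[len - 1] == '\r' || buf[len - 1] == '\n'))
        len--;
    return len;
}

/* Matches the signature used with CURLOPT_HEADERFUNCTION. The buffer is
 * not NUL-terminated. Returning anything other than size * nitems tells
 * libcurl that the callback failed. */
size_t fold_header_cb(char *buffer, size_t size, size_t nitems, void *userdata)
{
    struct header_list *list = userdata;
    size_t total = size * nitems;
    size_t len = trimmed_len(buffer, total);

    assert(list != NULL);

    if (len == 0)
        return total;   /* blank line: end of the header block */

    if ((buffer[0] == ' ' || buffer[0] == '\t') && list->count > 0) {
        /* Continuation: collapse the leading whitespace into one space
         * and append the rest to the previously stored line. */
        size_t skip = 0;
        while (skip < len && (buffer[skip] == ' ' || buffer[skip] == '\t'))
            skip++;
        char *prev = list->lines[list->count - 1];
        size_t plen = strlen(prev);
        char *grown = realloc(prev, plen + 1 + (len - skip) + 1);
        if (!grown)
            return 0;
        grown[plen] = ' ';
        memcpy(grown + plen + 1, buffer + skip, len - skip);
        grown[plen + 1 + (len - skip)] = '\0';
        list->lines[list->count - 1] = grown;
        return total;
    }

    /* Status line or new header: store a NUL-terminated copy. */
    char **lines = realloc(list->lines, (list->count + 1) * sizeof *lines);
    if (!lines)
        return 0;
    list->lines = lines;
    char *copy = malloc(len + 1);
    if (!copy)
        return 0;
    memcpy(copy, buffer, len);
    copy[len] = '\0';
    list->lines[list->count++] = copy;
    return total;
}
```

To use it, pass `fold_header_cb` as CURLOPT_HEADERFUNCTION and a pointer to a zero-initialized `struct header_list` as CURLOPT_HEADERDATA. The caller is responsible for freeing each stored line and the array afterwards. Note that the status line ends up in the list too, and a transfer that follows redirects will deliver several header blocks to the same callback.
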
Header API

Since version 7.84.0, libcurl has provided an easy-to-use API for accessing any response header from a previous transfer. Use curl_easy_header to look up a specific header, or curl_easy_nextheader to iterate over all of them.

The header API supports "continued" lines and it also works identically no matter which HTTP version the response comes over.

Daniel Stenberg
  • 4
    It may be the case that multi-line headers are unusual, but it's valid and I'd rather not just have an application entirely break if it did run into it. – Dustin Jan 03 '11 at 00:21
  • 1
    They are not allowed in HTTPbis, which is the pending update to the HTTP spec... adding support today for that seems utterly pointless to me. But sure go ahead if that's your game. – Daniel Stenberg Jan 11 '11 at 21:29
  • "libcurl reads the entire header and sends it as a single complete line to the callback." This isn't correct as I read it. libcurl seems to send a line at a time (including line-break characters, and including the final `\r\n`) to the callback until it reaches the body section. As @Dustin points out this is rather useless if you don't have full control over the servers your client is getting responses from. – jberryman May 25 '15 at 10:19