
I need to parse HTTP header fields:

key:value\r\n
key:value\r\n

How can I parse each value as two iterators that mark its beginning and end?

  • Please provide an SSCCE (http://sscce.org/). This is your parser: http://www.boost.org/doc/libs/1_55_0/libs/spirit/doc/html/spirit/qi/reference/directive/raw.html – Mike M Feb 13 '14 at 15:34
  • What Mike said. Also, RFC1945 is not this simple :/ – sehe Feb 13 '14 at 22:28

1 Answer

I've used this callback with libcurl's CURLOPT_HEADERFUNCTION in the past:

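#include <cstddef>
#include <string>
#include <boost/algorithm/string/trim.hpp>
#include <boost/spirit/include/qi.hpp>

// userdata points to an instance of webclient::response (a project-specific type)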
size_t header_callback(void *data, size_t size, size_t nmemb, void *userdata)
{
    auto const length = size*nmemb;

    auto const b = static_cast<char const*>(data);
    auto f = b,
         e = b + length;

    std::string key, value;

    using namespace boost::spirit::qi;
    // 2.2 Basic Rules (rfc1945)
    static const auto& tspecials = " \t()<>@,;:\\\"/[]?={}"; // includes '(' and ')' per rfc1945
    static const rule<char const*, std::string()> token = +~char_(tspecials); // FIXME? should filter CTLs

    auto self = static_cast<webclient::response*>(userdata);
    if (phrase_parse(f, e, token >> ':' >> lexeme[*(char_ - eol)], space, key, value))
    {
        boost::trim(value);
        auto insertion = self->headers.insert({key, value});
        if (!insertion.second)
        {
            // merge list-valued header occurrences (see rfc1945 4.2)
            insertion.first->second += ',';
            insertion.first->second.append(value);
        }
    }
    else
    {
        // roll with non-separated headers...
        std::string header(b, e);
        boost::trim(header);

        if (!header.empty())
        {
            auto insertion = self->headers.insert({header, "present"});
            logicErrAssert(insertion.second);
        }
    }

    return length;
}
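
If the key and value are needed as iterator pairs rather than copied strings, as the question asks, `qi::raw` (linked in the comments) is the Spirit way to get them. The block below is only a minimal standard-library sketch of the same idea, not Spirit code and not part of the callback above. `Range`, `HeaderField` and `split_header` are hypothetical names chosen for illustration, and the sketch assumes a single `key:value` line with no line folding.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper types, not taken from the original answer.
using Iter  = std::string::const_iterator;
using Range = std::pair<Iter, Iter>;   // half-open range [first, second)

struct HeaderField {
    Range key;
    Range value;
};

// Splits one "key:value\r\n" line into two iterator ranges.
// Returns false when there is no ':' or the key is empty.
// The ranges point into the caller's buffer; nothing is copied.
bool split_header(Iter first, Iter last, HeaderField& out)
{
    assert(first <= last);   // both iterators must come from the same string

    auto is_blank = [](char c) { return c == ' ' || c == '\t'; };

    // Drop the trailing CR/LF, if any.
    while (first != last && (*(last - 1) == '\r' || *(last - 1) == '\n'))
        --last;

    Iter colon = std::find(first, last, ':');
    if (colon == last || colon == first)
        return false;

    // Trim blanks around the value without touching the buffer.
    Iter vbegin = colon + 1;
    while (vbegin != last && is_blank(*vbegin))
        ++vbegin;
    Iter vend = last;
    while (vend != vbegin && is_blank(*(vend - 1)))
        --vend;

    out.key   = {first, colon};
    out.value = {vbegin, vend};
    return true;
}
```

A string can still be materialised on demand with `std::string(f.value.first, f.value.second)`. The ranges stay valid only as long as the underlying buffer does, which matters in a libcurl callback because the data pointer is not guaranteed to outlive the call.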

Note that `headers` in the callback is a case-insensitive map:

/** http://www.faqs.org/rfcs/rfc1945.html 4.2  Message Headers
 *
 * Field names are case-insensitive.
 * Header fields can be extended over multiple lines by preceding each
 * extra line with at least one SP or HT, though this is not recommended.
 *
 * Multiple HTTP-header fields with the same field-name may be present
 * in a message if and only if the entire field-value for that header
 * field is defined as a comma-separated list [i.e., #(values)]. It
 * must be possible to combine the multiple header fields into one
 * "field-name: field-value" pair, without changing the semantics of
 * the message, by appending each subsequent field-value to the first,
 * each separated by a comma.
 */

using ::misc::text_utils::ci_lookup;
typedef ci_lookup<std::string> message_headers_t;
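
The definition of `ci_lookup` is not shown here. As a rough stand-in, assuming ASCII field names, a `std::map` with a case-insensitive comparator gives similar lookup behaviour. `ci_less`, `headers_map` and `add_header` are hypothetical names for this sketch, not the answer's actual utility. `add_header` mirrors the comma-merging of repeated fields done in the callback.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Hypothetical comparator: orders strings while ignoring ASCII case.
struct ci_less {
    bool operator()(const std::string& a, const std::string& b) const
    {
        return std::lexicographical_compare(
            a.begin(), a.end(), b.begin(), b.end(),
            [](unsigned char x, unsigned char y) {
                return std::tolower(x) < std::tolower(y);
            });
    }
};

using headers_map = std::map<std::string, std::string, ci_less>;

// Inserts a header, or appends to an existing one with a comma
// (the list-merging rule quoted from rfc1945 4.2).
void add_header(headers_map& headers, const std::string& key, const std::string& value)
{
    assert(!key.empty());   // field names are never empty

    auto insertion = headers.insert({key, value});
    if (!insertion.second) {
        insertion.first->second += ',';
        insertion.first->second += value;
    }
}
```

With this comparator, `headers.find("content-length")` and `headers.find("Content-Length")` resolve to the same entry, and the spelling of the first inserted key is the one that is kept.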
sehe
  • Do you know how to parse the key into an iterator_range, and then look it up by some key? – user3306504 Feb 14 '14 at 10:01
  • Mike showed you the documentation for `qi::raw`. This is the solution. I realize this answer doesn't really answer your question in the narrow sense. I'm happy to delete it if you prefer. – sehe Feb 14 '14 at 10:03
  • I have the string "content-length", with first and last iterators to it and to the value. How do I get the value using the two iterators of the key? – user3306504 Feb 14 '14 at 10:24
  • I parse two iterators. How do I create a lower-case string from the range between those two iterators? With on_success? – user3306504 Feb 14 '14 at 10:57
  • I don't know which `on_success` you are referring to. For lower-casing, consider postprocessing or use something similar to this: http://paste.ubuntu.com/6930606/ (I don't see how/why you would convert your input to lower-case while returning the source iterators...? This would require mutable iterators and sounds like a bad idea in the light of non-fixed width char encodings) – sehe Feb 14 '14 at 11:13
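
The postprocessing route from the last comment could look roughly like the sketch below. It copies the range into a new lower-cased string and leaves the source buffer and its iterators untouched. `lower_copy` is a hypothetical helper name, and the sketch assumes single-byte ASCII input, which is what HTTP field names consist of.

```cpp
#include <algorithm>
#include <cctype>
#include <iterator>
#include <string>

// Hypothetical helper: returns a lower-cased copy of [first, last).
// The input range is only read, so const iterators are sufficient.
template <typename It>
std::string lower_copy(It first, It last)
{
    std::string out;
    std::transform(first, last, std::back_inserter(out),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
    return out;
}
```

The returned string can then serve as the lookup key, while the original iterator pair keeps referring to the unmodified input.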