
I am using the following code, taken from the Boost tutorials, to get a JSON string from the server.

The problem is that it takes more than 2 seconds to finish, even though both client and server are on localhost. If I remove the last 2 lines of the program, i.e. this while:

while (boost::asio::read(socket, response, boost::asio::transfer_at_least(1), error))

the program executes extremely fast. What might the problem be?

        boost::asio::streambuf response;
        boost::asio::read_until(socket, response, "\r\n");

        std::istream response_stream(&response);
        std::string http_version;
        response_stream >> http_version;
        unsigned int status_code;
        response_stream >> status_code;
        std::string status_message;
        std::getline(response_stream, status_message);
        if (!response_stream || http_version.substr(0, 5) != "HTTP/")
        {
          std::cout << "Invalid response\n";
          return 1;
        }
        if (status_code != 200)
        {
          std::cout << "Response returned with status code " << status_code << "\n";
          return 1;
        }


        boost::asio::read_until(socket, response, "\r\n\r\n");

        // Process the response headers.
        std::string header;
        while (std::getline(response_stream, header) && header != "\r");


        if (response.size() > 0)
          std::cout << &response;

        // Read until EOF, writing data to output as we go.
        boost::system::error_code error;
        while (boost::asio::read(socket, response,
              boost::asio::transfer_at_least(1), error))
          std::cout << &response;
        if (error != boost::asio::error::eof)
          throw boost::system::system_error(error);

A tcpdump capture showing some data from the server:

HTTP/1.1 200 OK
Connection: close
Content-Length: 42
Server: C/1.1
Date: Thu, 24 Nov 2016 07:47:27 GMT

{"Out":[1],"In":[1,2,3,4,5,6]}
Arunmu
cateof
  • Yes, it is really slow. I believe the issue is related to the content length. If I modify the code to read exactly the number of bytes given by the Content-Length in the response, the program runs fast. – cateof Nov 24 '16 at 08:38
  • @Arunmu I don't have access to the server code. It does not send a Content-Length, and I can see from the headers that the message is chunked. However, the above code also does NOT work well for a server that sends Connection: close and a Content-Length. – cateof Nov 24 '16 at 08:58
  • There you are... now you are mentioning the important part, that the response is chunked :) There is a chunk size written (in hex, I guess) before the start of every chunk. You can decode that to figure out how much to read. You could have saved a lot of time by just mentioning that. – Arunmu Nov 24 '16 at 09:10
  • OK, after so much debugging at least I learned something. You are right, I have seen a hex value. So is there an example around of how to do this? – cateof Nov 24 '16 at 09:15
  • I have provided an implementation. – Arunmu Nov 24 '16 at 10:16
  • Great, thanks for resolving this. – cateof Nov 24 '16 at 10:39

2 Answers


From the discussion in the comments, the main problem turned out to be reading the chunked data. In HTTP chunked transfer encoding, each chunk is preceded by its size, written in hexadecimal. Therefore, one has to read that size first; it is effectively the content length for that chunk.

      asio::streambuf response;
      // Get till all the headers
      asio::read_until(socket, response, "\r\n\r\n");

      // Check that response is OK. 
      std::istream response_stream(&response);
      std::string http_version;
      response_stream >> http_version;
      std::cout << "Version : " << http_version << std::endl;

      unsigned int status_code;
      response_stream >> status_code;

      std::string status_message;
      std::getline(response_stream, status_message);

      if (!response_stream || http_version.substr(0, 5) != "HTTP/") {
        std::cerr << "invalid response";
        return -1; 
      }

      if (status_code != 200) {
        std::cerr << "response did not return 200 but " << status_code;
        return -1; 
      }

      //read the headers.
      std::string header;
      while (std::getline(response_stream, header) && header != "\r") {
        std::cout << "H: " << header << std::endl;
      }

      std::string chunk_siz_str;

      // The header loop above has already consumed the blank "\r\n" line
      // that ends the headers, so the next line is the chunk size.

      // Read the Chunk size
      asio::read_until(socket, response, "\r\n");
      std::getline(response_stream, chunk_siz_str);
      std::cout << "CS : " << chunk_siz_str << std::endl;
      // The chunk size is hexadecimal; strtol stops at the trailing '\r'.
      size_t chunk_size = static_cast<size_t>(strtol(chunk_siz_str.c_str(), nullptr, 16));

      // Now how many bytes are yet to be read from the socket?
      // response might already hold part (or all) of the chunk
      // after the last `read_until`, so guard against underflow.
      size_t chunk_bytes_to_read =
          chunk_size > response.size() ? chunk_size - response.size() : 0;

      std::cout << "Chunk Length = " << chunk_size << std::endl;
      std::cout << "Additional bytes to read: " << chunk_bytes_to_read << std::endl;

      std::error_code error;
      size_t n = asio::read(socket, response, asio::transfer_exactly(chunk_bytes_to_read), error);

      if (error) {
        return -1; //throw boost::system::system_error(error);
      }

      std::ostringstream ostringstream_content;
      ostringstream_content << &response;

      auto str_response = ostringstream_content.str();
      std::cout << str_response << std::endl;

One slightly tricky part is that asio::read_until only guarantees that the buffer contains data up to and including the given delimiter; it may also have read additional data beyond the delimiter into the buffer.
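The code above reads only the first chunk. As a rough sketch of how the same idea extends to a whole chunked body, the function below decodes every chunk until the terminating zero-size chunk. It works on a plain `std::istream` that already holds the complete body, so it leaves the socket reads out entirely. `decode_chunked` is a hypothetical helper name, not part of Asio, and the sketch assumes well-formed input and ignores any trailer headers.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not an Asio function): decodes an HTTP/1.1 chunked
// body that is fully available in `in`, positioned just after the headers.
std::string decode_chunked(std::istream& in) {
    std::string body;
    std::string line;
    while (std::getline(in, line)) {
        // The size line is hexadecimal; std::stoul stops at the trailing
        // '\r' (or at a ';' that starts a chunk extension).
        const std::size_t chunk_size = std::stoul(line, nullptr, 16);
        if (chunk_size == 0) {
            return body;  // last chunk reached; trailers are ignored here
        }
        std::string chunk(chunk_size, '\0');
        in.read(&chunk[0], static_cast<std::streamsize>(chunk_size));
        if (static_cast<std::size_t>(in.gcount()) != chunk_size) {
            throw std::runtime_error("truncated chunk");
        }
        body += chunk;
        std::getline(in, line);  // consume the CRLF that follows the chunk data
    }
    throw std::runtime_error("missing terminating zero-size chunk");
}
```

With a real socket, each `std::getline` would be preceded by a `read_until(socket, response, "\r\n")` and each chunk read by a `transfer_exactly` read of whatever is not yet in the buffer, as in the answer's code.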

Arunmu

The only "EOF" in HTTP is when the TCP connection is closed. In this case you are lucky that the server is timing out after only 2 seconds before closing the connection - otherwise, your app would sit around even longer.

You need to use the Content-Length value to know how much data to read, rather than looking for an EOF condition.
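A minimal sketch of that approach, assuming the header lines are read with `std::getline` as in the question (so each still ends in `'\r'`). Both `parse_content_length` and `remaining_body_bytes` are hypothetical helper names introduced for illustration only.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns true and sets `length` if `header_line` is a
// Content-Length header. The field name is matched case-insensitively.
bool parse_content_length(const std::string& header_line, std::size_t& length) {
    static const std::string name = "content-length:";
    if (header_line.size() < name.size()) return false;
    for (std::size_t i = 0; i < name.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(header_line[i])) != name[i])
            return false;
    }
    // Skip optional whitespace between the colon and the value.
    std::size_t pos = name.size();
    while (pos < header_line.size() &&
           (header_line[pos] == ' ' || header_line[pos] == '\t')) {
        ++pos;
    }
    if (pos == header_line.size() ||
        !std::isdigit(static_cast<unsigned char>(header_line[pos]))) {
        return false;
    }
    length = std::stoul(header_line.substr(pos));  // stops at the trailing '\r'
    return true;
}

// Hypothetical helper: body bytes still to be read from the socket when
// `buffered` bytes of the body are already sitting in the stream buffer.
std::size_t remaining_body_bytes(std::size_t content_length, std::size_t buffered) {
    return content_length > buffered ? content_length - buffered : 0;
}
```

In the question's code this would mean calling `parse_content_length` on each `header` inside the header loop, then replacing the read-until-EOF loop with a single `boost::asio::read` using `boost::asio::transfer_exactly(remaining_body_bytes(length, response.size()))`.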

Google "HTTP pipelining" for an understanding of why the server isn't closing the TCP connection when you expect it to.

Allison Lock
  • Indeed, reading the Content-Length did the job. However, what happens if the server does not respond with a Content-Length and we have a chunked message? Does Boost offer a solution for this? – cateof Nov 24 '16 at 08:51
  • Boost ASIO doesn't (it works on a level lower than HTTP), so you would either need to handle the `Transfer-Encoding` header _and the chunked content format_ yourself, or use a 3rd-party HTTP library such as the [unofficial Boost.Http](https://boostgsoc14.github.io/boost.http/) – Allison Lock Nov 24 '16 at 09:04
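For anyone handling this by hand, the first step is deciding how the body is framed. The sketch below follows the HTTP/1.1 rule that `Transfer-Encoding: chunked` takes precedence over `Content-Length`, and that a response with neither runs until the server closes the connection. `BodyFraming`, `to_lower` and `choose_framing` are hypothetical names for illustration, and the sketch assumes the header lines have already been collected into a vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

enum class BodyFraming { Chunked, ContentLength, UntilClose };

// Lower-cases an ASCII string so header names and values can be compared
// case-insensitively.
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Hypothetical helper: picks the framing from the response header lines.
BodyFraming choose_framing(const std::vector<std::string>& header_lines) {
    bool has_length = false;
    for (const std::string& raw : header_lines) {
        const std::string line = to_lower(raw);
        // "transfer-encoding:" is 18 characters long.
        if (line.compare(0, 18, "transfer-encoding:") == 0 &&
            line.find("chunked", 18) != std::string::npos) {
            return BodyFraming::Chunked;  // wins over any Content-Length
        }
        // "content-length:" is 15 characters long.
        if (line.compare(0, 15, "content-length:") == 0) {
            has_length = true;
        }
    }
    return has_length ? BodyFraming::ContentLength : BodyFraming::UntilClose;
}
```

Only the `UntilClose` case needs the read-until-EOF loop from the question; the other two let the client stop as soon as the announced number of bytes has arrived.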