
Another one in the continuing saga of myself vs. Boost.Asio...

I have a simple asynchronous client and server that utilise async_write and async_read to communicate. The client can successfully write bytes to the socket, but the server never sees them; my read handler on the server fails with "Operation cancelled".

I'm inclined to believe that this may be a timing issue with the client writing the data after the server has tried to read it and failed, but I would have thought the data would be waiting on the socket anyway (unless the socket has been closed in the meantime).

To test this I simply re-ran the read operation in the error handler, i.e.

read_handler()
{
    if (!error) {
        /* bytes read */
    } else {
        async_read(socket, buffer, read_handler)
    }
}

But all this got me was a segfault in pthread_mutex_lock via a call to async_receive.

Could anyone point me in the direction of any relevant information (or, better yet, tell me exactly what I'm doing wrong ;) )?

UPDATE: The server and client are based around the chat server example in the Asio docs, with the client and server both running under the same process (could this be an issue? Thinking a bit more they both use the same io_service...); both asynchronous and using Boost 1.44.0. I'm working on OS X but this is reproducible on Linux, too.

UPDATE II: My hunch was correct and if the server and client are given separate io_service objects the async_read sees the bytes on the socket. This does still give a segfault in boost::asio::detail::kqueue_reactor::post_immediate_completion that seems to stem from the io_service.run(). Before I go any further, is using separate io_service objects the correct approach?

Alex Bitek
kfb
  • Which platform are you running on? – Len Holgate Oct 13 '10 at 11:05
  • The platform is OSX and Linux; both exhibit the same issue. – kfb Oct 13 '10 at 11:34
  • I think we need to see a bit more of your code to have a chance at guessing what you've done wrong! It'd also help to know something about your setup -- are the client and server running on the same or different machines, for example? Are both written using boost::asio? What version of Boost are you using (asio has changed quite a bit recently)? – dajames Oct 13 '10 at 13:56
  • It will take me a bit of time to censor the code down to a small example that illustrates the issue; but in the meantime in answer to your other questions, this is boost 1.44.0 with both client and server running from the same process and both bearing a remarkable similarity to the chat server example provided in Asio :) I'll update the original question with these details. – kfb Oct 13 '10 at 14:16
  • the asio chat [example](http://www.boost.org/doc/libs/1_44_0/doc/html/boost_asio/examples.html) does not have the client and server running in the same process. What are you trying to accomplish by doing that? You'll need to post more code for us to help. – Sam Miller Oct 14 '10 at 00:34
  • `if (!error) { /* bytes read */} else { async_read(socket, buffer, read_handler) }` I am wondering why you re-schedule the read from the socket after it has failed? – Vinzenz Oct 14 '10 at 00:44
  • When I have run two socket services in the same process with asio I have successfully used the same io_service object for both. The services weren't a client and server talking to each other, though, but rather two servers. – dajames Oct 14 '10 at 10:49
  • @Vinzenz Good point! I had missed that. – dajames Oct 14 '10 at 10:50
  • @Vinzenz the intention was for the handler to keep retrying the read over the process's lifetime; I've since learned that that's not a good way to go about it. – kfb Oct 15 '10 at 11:40

1 Answer


Operation cancelled (the operation_aborted error code) is passed to the handlers of pending operations when the socket is closed or its outstanding operations are cancelled.

Most likely your connection is somehow going out of scope.

Perhaps, as happened to me, you forgot to bind the async handlers to a shared_from_this() pointer. That is, you should be attaching your handlers like this:

async_read(m_socket,
           boost::asio::buffer((void*)m_buffer, m_header_size),
           boost::bind(&TcpConnection::handleRead, 
           shared_from_this(),
           boost::asio::placeholders::error,
           boost::asio::placeholders::bytes_transferred));

And NOT like this:

async_read(m_socket,
           boost::asio::buffer((void*)m_buffer, m_header_size),
           boost::bind(&TcpConnection::handleRead, 
           this, // <- raw pointer: does not keep the object alive; if the object is destroyed, the socket is closed
           boost::asio::placeholders::error,
           boost::asio::placeholders::bytes_transferred));
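
The lifetime effect of the first form can be reproduced without Boost. The following is a minimal sketch, assuming a hypothetical `HandlerQueue` that stands in for `io_service` and a hypothetical `Connection` class; neither is part of Asio. It only illustrates that a handler holding `shared_from_this()` keeps the connection alive after the scope that created it has ended, until the handler has run.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <queue>
#include <utility>

// Hypothetical stand-in for io_service: stores handlers and runs them later.
class HandlerQueue {
public:
    void post(std::function<void()> handler) { handlers_.push(std::move(handler)); }

    void run() {
        while (!handlers_.empty()) {
            std::function<void()> handler = std::move(handlers_.front());
            handlers_.pop();
            handler();  // the handler (and anything it owns) is destroyed after this iteration
        }
    }

private:
    std::queue<std::function<void()>> handlers_;
};

// Hypothetical connection; the destructor is where a real socket would be closed.
class Connection : public std::enable_shared_from_this<Connection> {
public:
    explicit Connection(bool& alive) : alive_(alive) { alive_ = true; }
    ~Connection() { alive_ = false; }

    void start(HandlerQueue& queue, bool& handler_ran) {
        // Capturing shared_from_this() gives the queued handler shared ownership,
        // so the object cannot be destroyed while the handler is still pending.
        std::shared_ptr<Connection> self = shared_from_this();
        queue.post([self, &handler_ran] { self->handle_read(handler_ran); });
    }

private:
    void handle_read(bool& handler_ran) {
        assert(alive_);  // the object is still valid when the handler runs
        handler_ran = true;
    }

    bool& alive_;
};

// Returns true if the connection outlived its creating scope, the handler ran,
// and the connection was destroyed once the handler was done.
bool connection_survives_scope() {
    bool alive = false;
    bool handler_ran = false;
    HandlerQueue queue;
    {
        auto conn = std::make_shared<Connection>(alive);
        conn->start(queue, handler_ran);
    }  // conn goes out of scope here, but the queued handler still holds a reference
    const bool alive_after_scope = alive;
    queue.run();
    return alive_after_scope && handler_ran && !alive;
}
```

Calling `connection_survives_scope()` from a test or from `main` exercises the whole sequence. Had the lambda captured `this` instead of `self`, the object would have been destroyed at the closing brace and the handler would have run against a dead object.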
Txangel
  • Thanks, I had a crash in a totally different region, but also with an `Operation canceled` error message. And it was that a `this` went out of scope! Thankfully I found your answer after debugging it for hours... – Chris Dec 30 '12 at 18:36
  • that is the greatest answer of all of them. completely weird error. diagnosis absolutely spot on. – Alexander Oh Mar 02 '13 at 07:31
  • and why exactly is the connection going out of scope, when this is passed? – bryanph Aug 31 '14 at 19:45
  • `this` (a TcpConnection) is being destroyed because it goes out of scope. I had this issue years ago now, but if I recall correctly the entire design was asynchronous around an event handler loop. A common policy was to minimise heap usage, hence the connections were also created on the stack. When returning from the function the destructor was called, which in turn destroyed `m_socket`, which closed it. That was the reason why the connections were going out of scope in our case. My only advice to you @bryanph is to review the lifetime of your connections because it can be the simplest cause. – Txangel Sep 02 '14 at 14:21
  • Thanks! I find it completely disturbing that this official boost asio example seems to teach it wrong: http://www.boost.org/doc/libs/1_43_0/doc/html/boost_asio/example/http/client/async_client.cpp – jakob.j Apr 20 '15 at 17:38
  • @jakob.j Of course not! The `client` there is a single object that has automatic storage duration. Why would it "go out of scope"? If it's not refcounted, there is no need to count the refs bound to a completion handler either. (IOW: different situations require different things) – sehe Feb 02 '16 at 22:41
  • In my case, this was not a solution. It is being caused because I'm calling a shutdown+close on the socket while an async_read operation is in progress. Currently trying to figure out how to deal with the exception properly at the boost level. – kevr Feb 01 '18 at 09:34
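
For the situation in the last comment, and for the retry loop discussed under the question, one common approach is to have the read handler treat the cancellation code as a normal shutdown signal rather than as a failure to retry. Below is a minimal sketch using only the standard library. It assumes that `std::errc::operation_canceled` can stand in for `boost::asio::error::operation_aborted` (both correspond to `ECANCELED` on POSIX systems); `ReadAction` and `next_step` are hypothetical names, not Asio API.

```cpp
#include <system_error>

// Hypothetical set of decisions a read handler might make.
enum class ReadAction { ProcessData, StopQuietly, ReportError };

// Decide what to do from the error code passed to a completion handler.
ReadAction next_step(const std::error_code& ec) {
    if (!ec) {
        return ReadAction::ProcessData;   // bytes were read; handle them and re-arm the read
    }
    if (ec == std::errc::operation_canceled) {
        return ReadAction::StopQuietly;   // socket was closed or cancelled: do not re-issue the read
    }
    return ReadAction::ReportError;       // any other failure: log it and tear the connection down
}
```

The point of the sketch is that a new read is started only on success, so a closed socket never has another operation queued against it.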