0

I am using the C++ Rest SDK ("Casablanca") to receive feed from Websocket-Servers. Currently, I have three different connections to three different servers running at the same time using the websocket_callback_client class.

The program runs for an undefined time and then suddenly receives SIGTRAP, Trace/ Breakpoint trap. This is the output of GDB:

#0  0x00007ffff5abec37 in __GI_raise (sig=5) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x000000000047bb8e in pplx::details::_ExceptionHolder::~_ExceptionHolder() ()
#2  0x000000000044be29 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() ()
#3  0x000000000047fa39 in pplx::details::_Task_impl<unsigned char>::~_Task_impl() ()
#4  0x000000000044be29 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() ()
#5  0x00007ffff6feb09f in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7fffc8021420, __in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/shared_ptr_base.h:546
#6  0x00007ffff6fffa38 in std::__shared_ptr<pplx::details::_Task_impl<unsigned char>, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7fffc8021418, __in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/shared_ptr_base.h:781
#7  0x00007ffff6fffa52 in std::shared_ptr<pplx::details::_Task_impl<unsigned char> >::~shared_ptr (this=0x7fffc8021418, __in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/shared_ptr.h:93
#8  0x00007ffff710f766 in pplx::details::_PPLTaskHandle<unsigned char, pplx::task<unsigned char>::_InitialTaskHandle<void, void web::websockets::client::details::wspp_callback_client::shutdown_wspp_impl<websocketpp::config::asio_tls_client>(std::weak_ptr<void> const&, bool)::{lambda()#1}, pplx::details::_TypeSelectorNoAsync>, pplx::details::_TaskProcHandle>::~_PPLTaskHandle() (this=0x7fffc8021410, __in_chrg=<optimized out>)
    at /home/cpprestsdk/Release/include/pplx/pplxtasks.h:1631
#9  0x00007ffff716e6f2 in pplx::task<unsigned char>::_InitialTaskHandle<void, void web::websockets::client::details::wspp_callback_client::shutdown_wspp_impl<websocketpp::config::asio_tls_client>(std::weak_ptr<void> const&, bool)::{lambda()#1}, pplx::details::_TypeSelectorNoAsync>::~_InitialTaskHandle() (this=0x7fffc8021410, __in_chrg=<optimized out>) at /home/cpprestsdk/Release/include/pplx/pplxtasks.h:3710
#10 0x00007ffff716e722 in pplx::task<unsigned char>::_InitialTaskHandle<void, void web::websockets::client::details::wspp_callback_client::shutdown_wspp_impl<websocketpp::config::asio_tls_client>(std::weak_ptr<void> const&, bool)::{lambda()#1}, pplx::details::_TypeSelectorNoAsync>::~_InitialTaskHandle() (this=0x7fffc8021410, __in_chrg=<optimized out>) at /home/cpprestsdk/Release/include/pplx/pplxtasks.h:3710
#11 0x00007ffff71f9cdd in boost::_bi::list1<boost::_bi::value<void*> >::operator()<void (*)(void*), boost::_bi::list0> (this=0x7fffdc7d7d28, f=@0x7fffdc7d7d20: 0x479180 <pplx::details::_TaskProcHandle::_RunChoreBridge(void*)>, a=...)
    at /usr/local/include/boost/bind/bind.hpp:259
#12 0x00007ffff71f9c8f in boost::_bi::bind_t<void, void (*)(void*), boost::_bi::list1<boost::_bi::value<void*> > >::operator() (this=0x7fffdc7d7d20) at /usr/local/include/boost/bind/bind.hpp:1222
#13 0x00007ffff71f9c54 in boost::asio::asio_handler_invoke<boost::_bi::bind_t<void, void (*)(void*), boost::_bi::list1<boost::_bi::value<void*> > > > (function=...) at /usr/local/include/boost/asio/handler_invoke_hook.hpp:69
#14 0x00007ffff71f9bea in boost_asio_handler_invoke_helpers::invoke<boost::_bi::bind_t<void, void (*)(void*), boost::_bi::list1<boost::_bi::value<void*> > >, boost::_bi::bind_t<void, void (*)(void*), boost::_bi::list1<boost::_bi::value<void*> > > > (function=..., context=...) at /usr/local/include/boost/asio/detail/handler_invoke_helpers.hpp:37
#15 0x00007ffff71f9b2e in boost::asio::detail::completion_handler<boost::_bi::bind_t<void, void (*)(void*), boost::_bi::list1<boost::_bi::value<void*> > > >::do_complete (owner=0x7488d0, base=0x7fffc801ecd0)
    at /usr/local/include/boost/asio/detail/completion_handler.hpp:68
#16 0x00000000004c34c1 in boost::asio::detail::task_io_service::run(boost::system::error_code&) ()
#17 0x00007ffff709fb27 in boost::asio::io_service::run (this=0x7ffff759ab78 <crossplat::threadpool::shared_instance()::s_shared+24>) at /usr/local/include/boost/asio/impl/io_service.ipp:59
#18 0x00007ffff7185a81 in crossplat::threadpool::thread_start (arg=0x7ffff759ab60 <crossplat::threadpool::shared_instance()::s_shared>) at /home/cpprestsdk/Release/include/pplx/threadpool.h:133
#19 0x00007ffff566e184 in start_thread (arg=0x7fffdc7d8700) at pthread_create.c:312
#20 0x00007ffff5b8237d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

At line #18 the soruce /pplx/threadpool.h:133 is given. This is the source-code around these lines:

  123     static void* thread_start(void *arg)
  124     {
  125 #if (defined(ANDROID) || defined(__ANDROID__))
  126         // Calling get_jvm_env() here forces the thread to be attached.
  127         get_jvm_env();
  128         pthread_cleanup_push(detach_from_java, nullptr);
  129 #endif
  130         threadpool* _this = reinterpret_cast<threadpool*>(arg);
  131         try
  132         {
  133             _this->m_service.run();
  134         }
  135         catch (const _cancel_thread&)
  136         {
  137             // thread was cancelled
  138         }
  139         catch (...)
  140         {
  141             // Something bad happened
  142 #if (defined(ANDROID) || defined(__ANDROID__))
  143             // Reach into the depths of the 'droid!
  144             // NOTE: Uses internals of the bionic library
  145             // Written against android ndk r9d, 7/26/2014
  146             __pthread_cleanup_pop(&__cleanup, true);
  147             throw;
  148 #endif
  149         }
  150 #if (defined(ANDROID) || defined(__ANDROID__))
  151         pthread_cleanup_pop(true);
  152 #endif
  153         return arg;
  154     }

For clarification, m_service is a boost::asio::io_service. To me it looks like line #133 throws an exception, it gets caught at line #139 and then rethrown. At this point, I have to catch it personally, because if I don't and the pplx-object gets destroyed with an uncaught exception, it will raise SIGTRAP.

This is how far I got with my research. The problem is I do not have a clue where this is happening. I have surrounded every position where data is sent through or received from websocket_callback_client with try {} catch(...){} and it is still happening.

Maybe someone who has used this library before can help me out.

Bobface
  • 2,782
  • 4
  • 24
  • 61
  • I'm getting the exact same error, did you manage to fix this in the end? – Victor.dMdB Feb 04 '17 at 04:11
  • It was a pretty long time ago, but I think it was because I tried to send data through a closed socket. – Bobface Feb 04 '17 at 12:29
  • This seems like one of the horrifying things about this particular SDK. They use their thread pool, but it they don't surface exceptions easily, so you end up with SIGTRAPs. I've found that with their tasks, if you use the `.then()`, then your exceptions get uncaught. You need to get the task and then do a `.get()` which then destroys any form of concurrency, since you're now blocked on the get. – Mobile Ben Apr 27 '17 at 20:58

2 Answers2

1

In my experience this happens due to a separate issue.
When the websocket_callback_client's close handler gets called, most people try to delete the websocket_callback_client. This internally calls the close function.
When this happens the websocket_callback_client will wait for the close to finish. If another thread realizes the connection is dead and tries to clean up you will have the same object being deleted from 2 different locations, which will cause major issues.
Howto reconnect to a server which does not answer to close() has a fairly thorough review of what happens when cpprestsdk calls close.

Hope this helps :)

Edit: As it turns out (the response I gave in the linked question has this), if you try to close or delete the websocket_callback_client from the close handler it will itself call the close handler, which will lock the thread.
The solution I have found that works best for me is to set a flag in the close handler and handle the cleanup in the main thread or at the very least an alternate thread.

Community
  • 1
  • 1
Aleksandr
  • 96
  • 7
1

Revisiting this. I've found a work around, which I've posted on the cpprestsdk github (https://github.com/Microsoft/cpprestsdk/issues/427).

The SDK does a poor job in surfacing exceptions and in the issue I've indicated they need to improve the documentation around this as well as provide a clean public interface to do this (you'll see the solution has a code smell to it).

What one needs to do is to rethrow the user exception.

This is in the context of making an http_client request call, but should be applicable for any usage of pplx.

client->request(request).then([=] (web::http::http_response response) mutable {
    // Your code here
}).then([=] (pplx::task<void> previous_task) mutable {
    if (previous_task._GetImpl()->_HasUserException()) {
        auto holder = previous_task._GetImpl()->_GetExceptionHolder(); // Probably should put in try

        try {
            // Need to make sure you try/catch here, as _RethrowUserException can throw
            holder->_RethrowUserException();
        } catch (std::exception& e) {
            // Do what you need to do here
        }
    }
});

The handling to catch that "unobserved exception" is done in the second then().

Mobile Ben
  • 7,121
  • 1
  • 27
  • 43