I've been struggling to wrap my head around how I can debug why my Sidekiq processor is stuck.
I've read the following resources:
- Dealing with Stuck Workers.
- Debugging stuck Ruby processes
- Problems and Troubleshooting · mperham/sidekiq Wiki
I attached output to all gdb traces. https://gist.github.com/Overload119/7f75d28d78e683ff3096bd9058164c9c
I took all the lines that had C extensions
#0 0x00007fd71db8400c in pthread_cond_wait@@GLIBC_2.3.2 () at /lib/x86_64-linux-gnu/libpthread.so.0
#159 0x00007fd71db7dfa3 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#160 0x00007fd71d1b54cf in clone () at /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fd7193323bc in std::condition_variable::wait(std::unique_lock<std::mutex>&) () at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2 0x00007fd713ebe5e9 in rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long) () at /usr/lib/librocksdb.so.5.17
#3 0x00007fd713ebe981 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*) () at /usr/lib/librocksdb.so.5.17
#4 0x00007fd719337b2f in () at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007fd71db7dfa3 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#6 0x00007fd71d1b54cf in clone () at /lib/x86_64-linux-gnu/libc.so.6
#48 0x00007fd71db7dfa3 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#49 0x00007fd71d1b54cf in clone () at /lib/x86_64-linux-gnu/libc.so.6
#0 0x00007fd71d1aa916 in ppoll () at /lib/x86_64-linux-gnu/libc.so.6
#18 0x00007fd71db7dfa3 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#19 0x00007fd71d1b54cf in clone () at /lib/x86_64-linux-gnu/libc.so.6
#80 0x00007fd71db7dfa3 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#81 0x00007fd71d1b54cf in clone () at /lib/x86_64-linux-gnu/libc.so.6
#47 0x00007fd71db7dfa3 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#48 0x00007fd71d1b54cf in clone () at /lib/x86_64-linux-gnu/libc.so.6
#35 0x00007fd71db7dfa3 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#36 0x00007fd71d1b54cf in clone () at /lib/x86_64-linux-gnu/libc.so.6
#198 0x00007fd71db7dfa3 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#199 0x00007fd71d1b54cf in clone () at /lib/x86_64-linux-gnu/libc.so.6
#0 0x00007fd71d1aff59 in syscall () at /lib/x86_64-linux-gnu/libc.so.6
#7 0x00007fd71db7dfa3 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#8 0x00007fd71d1b54cf in clone () at /lib/x86_64-linux-gnu/libc.so.6
#0 0x00007fd71d1b57ef in epoll_wait () at /lib/x86_64-linux-gnu/libc.so.6
#9 0x00007fd71db7dfa3 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#10 0x00007fd71d1b54cf in clone () at /lib/x86_64-linux-gnu/libc.so.6
#11 0x00007fd71db7dfa3 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#12 0x00007fd71d1b54cf in clone () at /lib/x86_64-linux-gnu/libc.so.6
#37 0x00007fd71db7dfa3 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#38 0x00007fd71d1b54cf in clone () at /lib/x86_64-linux-gnu/libc.so.6
#8 0x00007fd71db7dfa3 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#9 0x00007fd71d1b54cf in clone () at /lib/x86_64-linux-gnu/libc.so.6
#0 0x00007fd71db843f9 in pthread_cond_timedwait@@GLIBC_2.3.2 () at /lib/x86_64-linux-gnu/libpthread.so.0
#53 0x00007fd71db7dfa3 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#54 0x00007fd71d1b54cf in clone () at /lib/x86_64-linux-gnu/libc.so.6
#124 0x00007fd71db7dfa3 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#125 0x00007fd71d1b54cf in clone () at /lib/x86_64-linux-gnu/libc.so.6
#20 0x00007fd71db7dfa3 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#21 0x00007fd71d1b54cf in clone () at /lib/x86_64-linux-gnu/libc.so.6
#40 0x00007fd71db7dfa3 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#41 0x00007fd71d1b54cf in clone () at /lib/x86_64-linux-gnu/libc.so.6
Here I think
#1 0x00007fd7193323bc in std::condition_variable::wait(std::unique_lock<std::mutex>&) () at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2 0x00007fd713ebe5e9 in rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long) () at /usr/lib/librocksdb.so.5.17
#3 0x00007fd713ebe981 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*) () at /usr/lib/librocksdb.so.5.17
looks suspicious but I'm not sure where to go from here. Is there anyway I can confirm whether or not this is holding on to a lock and causing the Sidekiq job to get stuck?