My call stack shows the following:

 --- called from signal handler with signal 10 (SIGBUS) ---
 001301b8 allocate__t24__default_alloc_template2b0i0Ui (20, 20, 309940, 36, fc55
1a00, 0) + a4
 0011dcb8 __nw__Q2t12basic_string3ZcZt18string_char_traits1ZcZt24__default_alloc
_template2b0i0_3RepUiUi (10, 10, 7773e8, 0, 0, 0) + 14
 0011dcf8 create__Q2t12basic_string3ZcZt18string_char_traits1ZcZt24__default_all
oc_template2b0i0_3RepUi (a, a, 7773e8, a, 0, 0) + 24
 0011e0bc replace__t12basic_string3ZcZt18string_char_traits1ZcZt24__default_allo
c_template2b0i0UiUiPCcUi (fbcff5c0, 0, ffffffff, fcbf55e2, a, 80808080) + 114
 00133ef0 assign__t12basic_string3ZcZt18string_char_traits1ZcZt24__default_alloc
_template2b0i0PCcUi (fbcff5c0, fcbf55e2, a, ffffffff, ffffffff, 20) + 24
 00132c78 assign__t12basic_string3ZcZt18string_char_traits1ZcZt24__default_alloc
_template2b0i0PCc (fbcff5c0, fcbf55e2, 15b0, 15d0, 16f0, 0) + 24
 0012f970 __t12basic_string3ZcZt18string_char_traits1ZcZt24__default_alloc_templ
ate2b0i0PCc (fbcff5c0, fcbf55e2, fcbf55d8, fbcff70e, 10, e00) + 28
 001f7e0c getFiles__7ListDirb (fbcff8e4, 0, 241000, 0, 4e61a0, ff11f478) + 144
. . .

Does a failure inside the allocator mean that too much memory is in use? How can I monitor memory usage as it grows and shrinks to find out where the problem lies in such cases? Can I override allocate__t24__default_alloc_template2b0i0Ui, i.e. __default_alloc_template<false, 0>::allocate(unsigned int), so that it calls a custom allocate function?

Dr. Debasish Jana
  • As with just about any crash, you should use a *debugger* to catch it, and go up the call-stack until you get to your code, where you examine all involved variables to see if they are correct. Knowing where the crash happens in *your* code is always useful. I also recommend you read [How to debug small programs](https://ericlippert.com/2014/03/05/how-to-debug-small-programs/) by Eric Lippert. – Some programmer dude Aug 25 '17 at 11:54
  • This problem shows up after handling a heavy volume of data. Could it be caused by unallocated memory? – Dr. Debasish Jana Aug 25 '17 at 12:05
  • You might want to [read more about bus errors](https://en.wikipedia.org/wiki/Bus_error). – Some programmer dude Aug 25 '17 at 12:06
  • It means you have a bug in your code. – Sam Varshavchik Aug 25 '17 at 12:25
  • And assuming you're running Solaris on SPARC, `SIGBUS` is almost certainly caused by a misaligned memory access. That can be caused by heap corruption, which is usually "found" by `libc` code under `malloc()`/`free()`/ *et al*, stack overflow causing the return address to be corrupted to an invalid address in such a way as to cause a `SIGBUS` instead of a `SIGSEGV` (rare, but it happens), or simply violating strict aliasing in your own code. – Andrew Henle Aug 25 '17 at 13:57

1 Answer

call stack shows SIGBUS, what does that mean

It would probably be helpful to show the top of the call stack so we can inspect alignment of pointers. It would also be helpful to know the platform and the instruction that caused the SIGBUS.

It's been my experience that SIGBUS is often related to unaligned data. Before you go down a rabbit hole, try adding -xmemalign=4i or -xmemalign=8i to CFLAGS and CXXFLAGS.

I seem to recall that SPARC has an instruction that operates more efficiently on wider data but is very sensitive to alignment. If you cast a uint8_t* to a uint32_t* or uint64_t*, then that buffer really needs to be aligned, because SunCC will generate the more efficient move by default. This is the strict aliasing violation Andrew Henle speaks of in the comments. Sun is not like x86, and it will SIGBUS if you cheat.
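To make the cast problem concrete, here is a minimal sketch of checking a pointer's alignment and of reading a 32-bit value from a byte buffer without the cast. The names `is_aligned` and `load_u32` are hypothetical helpers for illustration, not something from the question's code or from the Sun libraries, and the sketch assumes a compiler with `<cstdint>`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when p is aligned to `alignment` bytes.
// `alignment` is assumed to be a power of two (4 for uint32_t, 8 for uint64_t).
inline bool is_aligned(const void* p, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Hypothetical helper: reads a 32-bit value from any byte address.
// memcpy makes no alignment assumption about `p`, unlike
// *reinterpret_cast<const std::uint32_t*>(p), which may be compiled to a
// single wide load that traps on an unaligned address.
inline std::uint32_t load_u32(const unsigned char* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}
```

Something like `is_aligned(ptr, 4)` can be dropped into an assertion just before a suspicious cast to confirm whether alignment is really the culprit, and `load_u32` is one way to remove the cast altogether.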

Also see B.2.111 -xmemalign=ab in the Sun manual. Searching Google for "-xmemalign=4i" also turns up a lot of good hits. The rub is, until you suffer the problem and get to the bottom of it, you don't know that's what you need to search for.

(I spent months chasing one crash on a SPARC in a self test, and it was due to a dirty cast and the wider move instruction. -xmemalign=4i fixed it for me.)

jww
  • Note that if this is performance-critical code, you should fix the memory alignment problem rather than use the compiler flag workaround, as the alignment is there for a good reason. – cb88 Aug 31 '17 at 15:17
  • @cb88 - yes, agreed. For us, it was caused by public domain code that cast an array of 32-bit words to a 64-bit word pointer. I did not want to rewrite the routine, which was SHA-1 code. But in my heart I know the code is wrong :( – jww Aug 31 '17 at 15:20
  • It's interesting to hear of actual examples of this as well... seems an odd thing to do casting an array to a pointer of a different type... since arrays already are pointers effectively. – cb88 Aug 31 '17 at 15:29
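For the case described in the comments, where an array of 32-bit words is read through a 64-bit pointer, one way to avoid the cast is to build the 64-bit value from the two words explicitly. This is only a sketch: `load_u64_from_words` is a hypothetical name, it is not taken from the SHA-1 code in question, and putting the first word in the high half is an assumption that matches what a big-endian machine such as SPARC would read through the cast.

```cpp
#include <cstdint>

// Hypothetical helper: combines two adjacent 32-bit words into one 64-bit
// value, first word in the high half. No uint64_t* is ever formed, so only
// the 4-byte alignment the array already has is required.
inline std::uint64_t load_u64_from_words(const std::uint32_t* w) {
    return (static_cast<std::uint64_t>(w[0]) << 32) | w[1];
}
```

Because the result is assembled with shifts rather than read from memory as a 64-bit object, the value does not depend on the host byte order.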