I am seeing some strange behaviour in glibc. The code had a bug where it would pass a random pointer to fclose(). I would have expected it to crash at that point, but instead it hangs in pthread_once(), with the backtrace below. The program does not use any threads.

#0  0x000000318180ca38 in pthread_once () from /lib64/libpthread.so.0
#1  0x0000003181109d1c in backtrace () from /lib64/libc.so.6
#2  0x0000003181075d34 in __libc_message () from /lib64/libc.so.6
#3  0x000000318107c6fc in malloc_consolidate () from /lib64/libc.so.6
#4  0x000000318107d719 in _int_malloc () from /lib64/libc.so.6
#5  0x0000003181080a4a in calloc () from /lib64/libc.so.6
#6  0x0000003180c0b0df in _dl_new_object () from /lib64/ld-linux-x86-64.so.2
#7  0x0000003180c061ac in _dl_map_object_from_fd () from /lib64/ld-linux-x86-64.so.2
#8  0x0000003180c08563 in _dl_map_object () from /lib64/ld-linux-x86-64.so.2
#9  0x0000003180c13861 in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
#10 0x0000003180c0f304 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#11 0x0000003180c131eb in _dl_open () from /lib64/ld-linux-x86-64.so.2
#12 0x00000031811305d2 in do_dlopen () from /lib64/libc.so.6
#13 0x0000003180c0f304 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#14 0x0000003181130692 in __libc_dlopen_mode () from /lib64/libc.so.6
#15 0x0000003181109c05 in init () from /lib64/libc.so.6
#16 0x000000318180ca40 in pthread_once () from /lib64/libpthread.so.0
#17 0x0000003181109d1c in backtrace () from /lib64/libc.so.6
#18 0x0000003181075d34 in __libc_message () from /lib64/libc.so.6
#19 0x000000318107d0b8 in _int_free () from /lib64/libc.so.6
#20 0x000000318106ba6d in fclose@@GLIBC_2.2.5 () from /lib64/libc.so.6

This is on Fedora 19 with glibc-2.17-20.fc19.x86_64, and the program is started from systemd with StandardError=null, so there's no place for __libc_message() to output an error message to.

I've fixed the code, but is that hang a glibc bug or what?

nafmo

2 Answers

Of course it's not a glibc bug: you're breaking the rules and can get whatever behavior happens to happen. The manual page says:

In either case any further access (including another call to fclose()) to the stream results in undefined behavior.

It's formally wrong to have some form of expectation as to what's going to happen when triggering undefined behavior. The behavior is undefined, that means nobody gets to say what's "right" and what's "wrong".

As to exactly why this happens, that's basically only of interest to glibc implementors. That said, this answer hints at an explanation: fclose() is thread-safe, so it expects the FILE to contain a mutex. You're passing a de-allocated structure, causing the library to use random crap data as a mutex, which locks up. Pretty reasonable.
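
One common way to stay out of this territory is to clear the pointer as soon as the stream is closed. The sketch below is only an illustration: `close_stream` is a hypothetical helper name, not part of glibc or the standard library. It guards against closing the same stream twice through the same variable; it cannot help if the pointer was never initialized in the first place, so FILE pointers should also start out as NULL.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not a libc function): close *fp at most once and
 * clear the caller's pointer afterwards.
 * Returns 0 on success or when there was nothing to close, EOF if
 * fclose() reported an error.
 */
int close_stream(FILE **fp)
{
    int rc = 0;

    if (fp != NULL && *fp != NULL) {
        rc = fclose(*fp);
        /* The stream must not be used again, even if fclose() failed. */
        *fp = NULL;
    }
    return rc;
}
```

With this, a second `close_stream(&f)` on the same variable is a harmless no-op instead of undefined behavior.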

unwind

I've been looking into a similar backtrace recently, and I believe this is a glibc bug that has been around for at least 7 years, since Ubuntu 8.04.

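Basically, this happens after memory corruption has occurred. When glibc detects the corruption, __libc_message() calls backtrace(), and unfortunately that path allocates memory itself: in the question's backtrace, the one-time init() run under pthread_once() opens a library, which ends up in calloc() (frames #16 down to #5). Since the heap is corrupted, the allocation hits the error path too and tries to produce a backtrace again. The nested backtrace() re-enters pthread_once() while the first initialization is still in progress, and the result is a deadlock in pthread_once().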
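
The re-entrancy can be modeled in a few lines of C. This is a minimal sketch, not glibc's code: `model_once`, `model_init`, `model_backtrace`, `heap_corrupted`, and `reentrant_calls` are all hypothetical names made up for illustration. Where the real pthread_once() would wait forever for the initialization that is already running, the model returns `ONCE_WOULD_DEADLOCK` so that the program can finish.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a once-control; NOT glibc's implementation. */
enum once_state  { ONCE_NOT_STARTED, ONCE_IN_PROGRESS, ONCE_DONE };
enum once_result { ONCE_RAN, ONCE_ALREADY_DONE, ONCE_WOULD_DEADLOCK };

static enum once_state backtrace_once = ONCE_NOT_STARTED;
static bool heap_corrupted = false;  /* stand-in for the damaged heap */
static int reentrant_calls = 0;      /* nested calls that would have hung */

/* Stand-in for backtrace(): runs its one-time init via the once-control. */
enum once_result model_backtrace(void);

static enum once_result model_once(enum once_state *state, void (*init)(void))
{
    assert(state != NULL && init != NULL);

    switch (*state) {
    case ONCE_DONE:
        return ONCE_ALREADY_DONE;
    case ONCE_IN_PROGRESS:
        /* A real once-control would block here until init finishes,
         * but init is further up our own call stack, so it never will. */
        return ONCE_WOULD_DEADLOCK;
    default:
        *state = ONCE_IN_PROGRESS;
        init();
        *state = ONCE_DONE;
        return ONCE_RAN;
    }
}

/* Stand-in for init(): it allocates memory. On a corrupted heap the
 * allocator reports an error, and the error path asks for a backtrace. */
static void model_init(void)
{
    if (heap_corrupted && model_backtrace() == ONCE_WOULD_DEADLOCK)
        reentrant_calls++;
}

enum once_result model_backtrace(void)
{
    return model_once(&backtrace_once, model_init);
}
```

Calling `model_backtrace()` with `heap_corrupted` set to true makes the nested call hit the `ONCE_IN_PROGRESS` case, which corresponds to frame #0 of the backtrace in the question.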

=EDITED= I found a bug tracker entry for this issue, but it seems to be fixed only in the master branch: https://sourceware.org/bugzilla/show_bug.cgi?id=16159