Yesterday we had a crash of PostgreSQL 9.5.14 running on Debian 8 (Linux xxxxxx 3.16.0-7-amd64 #1 SMP Debian 3.16.59-1 (2018-10-03) x86_64 GNU/Linux): a segmentation fault. The database closed all connections and reinitialized itself, staying in recovery mode for about 1 minute.

PostgreSQL log:

2018-10-xx xx:xx:xx UTC [580-2] LOG: server process (PID 16461) was terminated by signal 11: Segmentation fault

kern.log:

Oct xx xx:xx:xx xxxxxxxx kernel: [117977.301353] postgres[16461]: segfault at 7efd3237db90 ip 00007efd3237db90 sp 00007ffd26826678 error 15 in libc-2.19.so[7efd322a2000+1a1000]

According to the libc documentation (https://support.novell.com/docs/Tids/Solutions/10100304.html), error code 15 means: NX_EDEADLK 15 resource deadlock would occur, which does not tell me much.

Could you please tell me if we can do something to avoid this problem in the future? This is, of course, a production server. All packages are currently up to date. Upgrading PostgreSQL is unfortunately not an option. The server runs on Google Compute Engine.

JosMac

1 Answer

error code 15 means: NX_EDEADLK 15

No, it doesn't mean that. This answer explains how to interpret 15 here.

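The kernel prints this value in hexadecimal, so 15 here is 0x15: bits 0, 2 and 4 are set => protection fault, user mode, instruction fetch. (Read as decimal, it would be bits 0, 1, 2, 3 => protection fault, write access, user mode, use of reserved bit.) The faulting address in your log is equal to the instruction pointer, which fits an instruction fetch. Most likely your postgres process jumped through some wild pointer into memory that is not executable.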
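As a minimal sketch, the bitmask can be decoded mechanically. The bit meanings below follow the x86 page-fault error code as used by the Linux kernel; the names `FAULT_BITS` and `decode_fault_code` are hypothetical and only serve this illustration. The kernel formats this field with `%lx`, so the sketch reads the logged value as hexadecimal by default and takes a `base` argument so the decimal reading can be compared.

```python
# x86 page-fault error-code bits, as interpreted by the Linux kernel.
# Each entry: (mask, meaning if set, meaning if clear or None).
FAULT_BITS = [
    (0x01, "protection fault", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "use of reserved bit", None),
    (0x10, "instruction fetch", None),
]


def decode_fault_code(text, base=16):
    """Decode the 'error N' field of a kernel segfault line.

    The kernel prints the field in hex, hence base=16 by default.
    """
    code = int(text, base)
    meanings = []
    for mask, if_set, if_clear in FAULT_BITS:
        if code & mask:
            meanings.append(if_set)
        elif if_clear is not None:
            meanings.append(if_clear)
    return meanings


if __name__ == "__main__":
    print("hex reading:    ", decode_fault_code("15"))
    print("decimal reading:", decode_fault_code("15", base=10))
    # Offset of the faulting address inside the mapping named in kern.log
    print("offset in mapping:", hex(0x7EFD3237DB90 - 0x7EFD322A2000))
```

The last line subtracts the mapping base from the faulting address, both taken from the kern.log line in the question; that offset is what you would look up in the library once you have symbols.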

if we can do something to avoid this problem in the future?

The only thing you can do is find the bug and fix it, or upgrade to a release of PostgreSQL where that bug is already fixed (and hope that no new ones were introduced).

To understand where the bug might be, you should check whether a core dump was produced (if not, enable core dumps). If you have the core, run gdb /path/to/postgres /path/to/core, and then use the GDB where command. That will give you the crash stack trace, which may allow you to find similar bug reports.
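To check whether the running server is even allowed to write a core file, one option is to look at the "Max core file size" row of /proc/<pid>/limits for the postmaster process. The following is a rough sketch, assuming a Linux /proc filesystem; the function names and the sample text are made up for illustration, and the PID is something you have to supply yourself.

```python
CORE_ROW = "Max core file size"


def parse_core_limit(limits_text):
    """Return (soft, hard) core size limits from /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        if line.startswith(CORE_ROW):
            # Remaining columns: soft limit, hard limit, units
            fields = line[len(CORE_ROW):].split()
            return fields[0], fields[1]
    raise ValueError("no core file size row found")


def core_dumps_enabled(limits_text):
    """A soft limit of 0 means the kernel will not write a core file."""
    soft, _hard = parse_core_limit(limits_text)
    return soft != "0"


def read_limits(pid):
    """Read the limits of a live process (Linux only)."""
    with open(f"/proc/{pid}/limits") as f:
        return f.read()


if __name__ == "__main__":
    # Illustrative sample; on a real host use read_limits(<postmaster pid>)
    sample = (
        "Limit                     Soft Limit           Hard Limit           Units\n"
        "Max core file size        0                    unlimited            bytes\n"
    )
    print(parse_core_limit(sample), core_dumps_enabled(sample))
```

If the soft limit turns out to be 0, the limit has to be raised in whatever starts the server (for example with ulimit -c unlimited in that environment) before the next crash can leave a core behind.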

Employed Russian
  • You are absolutely right. I did some digging over the weekend and asked a lot of people, and they gave me a similar answer. Over the weekend we had 2 more crashes, so we will migrate to a completely new instance with the latest Debian 9 and the latest PostgreSQL. – JosMac Oct 22 '18 at 06:35