
I want to retrieve the executable from a core dump, and the output of any Linux package used to get this information should contain execfn.

Here is what I have tried so far:

$ file kms
kms: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from '/test', real uid: 1000440000, effective uid: 1000440000, real gid: 0, effective gid: 0, execfn: '/test', platform: 'x86_64'

The file command only works for some cores, so it is not a generic solution; some core dumps give the following output instead.

$ file ss
ss: ELF 64-bit LSB core file x86-64, version 1 (SYSV), too many program header sections (6841)

The gdb command does not behave the same way for all core dumps. Its output is inconsistent: for some core dumps it does not match what the strings command reports.

$ gdb kms
Core was generated by `/test'.

I even tried the strings command, and I think it gives the proper output, but the format doesn't contain execfn, so I can't use it in my solution.

$ strings kms | grep ^/ | tail -1
/test

Can anyone please suggest a Linux package that retrieves the executable from a core dump and includes execfn in its output?

kms
  • Did you ask the system administrator of your Linux computer? – Basile Starynkevitch Jun 14 '20 at 14:49
  • Provide some [mre] in your question please and give much more details. – Basile Starynkevitch Jun 14 '20 at 14:55
  • Create a standard core dump using `kill -11 PID`. The PID can be that of a docker container which we are supposed to crash. But the weird thing is that the above solution using `file` works for some cores and not for others. Thanks @BasileStarynkevitch for your inputs. – kms Jun 18 '20 at 00:40
  • Does this answer your question? [Find which program caused a core dump file](https://stackoverflow.com/questions/13308922/find-which-program-caused-a-core-dump-file) – Victor Sergienko Aug 08 '23 at 22:31
  • Are you sure that the limit (as set by [setrlimit(2)](https://man7.org/linux/man-pages/man2/setrlimit.2.html) with `RLIMIT_CORE`...) is large enough? Consider adding a (tested) call to `setrlimit` appropriately in your executable source code. – Basile Starynkevitch Aug 09 '23 at 06:14
  • The question needs detail. What is your executable doing? Is it a command line program running for a short time (a few seconds) or a daemon or service running for days of elapsed real time, or hours of CPU time? Is the core dump easy to reproduce? – Basile Starynkevitch Aug 09 '23 at 07:53

1 Answer


Try running the file(1) command on your core(5) file, but note that this requires the core file to be complete. See below, and also gcore(1), strace(1) and ptrace(2).
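The "too many program header sections (6841)" message quoted in the question appears to come from a limit inside file(1) on how many ELF program headers it will process. Versions of file(1) that support the -P option let you raise that limit through the elf_phnum parameter. A minimal sketch, assuming such a version is installed; extract_execfn and core_execfn are hypothetical helper names, and 10000 is just an arbitrary value above the 6841 reported:

```shell
# Print the execfn field from file(1) output read on stdin.
# extract_execfn is a hypothetical helper, not part of file(1).
extract_execfn() {
    sed -n "s/.*execfn: '\([^']*\)'.*/\1/p"
}

# Hypothetical wrapper: raise the ELF program-header limit, then parse.
# Assumes a file(1) build that understands -P elf_phnum=N.
core_execfn() {
    file -P elf_phnum=10000 -- "$1" | extract_execfn
}
```

Calling `core_execfn ss` would then print the execfn path if file(1) manages to reach the relevant note; on a truncated core it may still print nothing.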

If your ELF executable (see elf(5)) was built with DWARF debugging information then you should have enough information in your core file. See also gdb(1) and this answer.

DWARF debugging information is produced by compiling and linking your program with the -g flag (or -g2 ...), assuming it is built with GCC or Clang, that is, with a recent gcc, g++, gfortran, clang or clang++ command.

Be aware of setrlimit(2). You may need to use the ulimit builtin of GNU bash (see bash(1) and the documentation of GNU bash...), or the limit builtin of zsh, to increase the core file size limit.

If your core dump size limit (i.e. RLIMIT_CORE for setrlimit) is too small, it is preferable to raise it and run your program again. A good developer could disable core dumps in an executable. My guess (perhaps wrong) is that a core size limit that is too small might be consistent with your observations.
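To check whether the core size limit is involved, the ulimit builtin can show and raise it in the shell that launches the program. A minimal sketch for bash or another POSIX-style shell; it only raises the soft limit up to the existing hard limit, which needs no privileges, whereas raising the hard limit itself usually does:

```shell
# Show the current soft and hard core-size limits of this shell.
ulimit -S -c
ulimit -H -c

# Raise the soft limit to whatever the hard limit currently allows.
# Child processes started from this shell inherit the new value.
ulimit -S -c "$(ulimit -H -c)"
```

If the hard limit printed is 0, core dumps stay disabled for this session and the limit has to be changed by whoever configures the login environment or container.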

If your interactive Unix shell is something other than /bin/bash (e.g. fish), be sure to read its documentation. See also passwd(5), ps(1) (to be used as ps $$), pstree(1), top(1).

See also proc(5). You might try cat /proc/$$/limits or /bin/cat /proc/self/limits in your terminal before running your program there. Perhaps /bin/cat /proc/version could be needed to understand more.
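The limits file is a fixed-width table, so the core row can be pulled out directly. A small sketch, assuming the usual Linux layout in which the row starts with "Max core file size" followed by the soft and hard values; core_limits is a hypothetical helper name:

```shell
# Print the soft and hard core-size limits from a /proc/PID/limits table on stdin.
# core_limits is a hypothetical helper; fields 5 and 6 are the soft and hard columns.
core_limits() {
    awk '/^Max core file size/ { print "soft=" $5, "hard=" $6 }'
}

# On Linux, inspect the current shell before launching the program from it.
if [ -r "/proc/$$/limits" ]; then
    core_limits < "/proc/$$/limits"
fi
```

A soft value of 0 means no core file will be written for programs started from that shell.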

Your Linux kernel can also be configured to avoid core dumps. Ask for details on kernelnewbies and read more about SELinux. To query the kernel configuration, some Linux kernels accept gzcat /proc/config.gz as root, but others don't. You might need root access with sudo(8) or su(1). See credentials(7).

On Linux, you might be interested in Ian Lance Taylor's libbacktrace. RefPerSys and GCC use it.

My suggestion is to improve the source code of your application to use syslog(3) to log all the program arguments. If you compile your application with a recent GCC (in 2023), you could pass both the -O2 optimization flag and the -g debugging flag to the compiler, and perhaps even use Ian Lance Taylor's libbacktrace. If you use the git version control system, you can embed version ids in the executable (as RefPerSys does).
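When the program is started through a shell wrapper, the same logging can be done from the wrapper with logger(1), which writes to the system log. A minimal sketch; format_cmdline, log_cmdline and the tag myapp-wrapper are hypothetical names chosen for illustration:

```shell
# Build the log message from the full argument list.
format_cmdline() {
    printf 'exec: %s\n' "$*"
}

# Send the message to the system log under an illustrative tag.
log_cmdline() {
    logger -t myapp-wrapper "$(format_cmdline "$@")"
}

# In the wrapper script, just before starting the real program:
#   log_cmdline /path/to/program "$@"
#   exec /path/to/program "$@"
```

With this in place, the executable path and arguments can be recovered from the system log even when the core file is truncated or unreadable.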

Basile Starynkevitch
  • With a previous employer, some of my colleagues ran "pstack" in order to see the stack trace of a core dump. Maybe this might also contain the executable's name? – Dominique Jun 14 '20 at 18:49
  • We have a mechanism in place where we get the stack trace, but unfortunately it is not available because of the error mentioned above: `ss: ELF 64-bit LSB core file x86-64, version 1 (SYSV), too many program header sections (6841)` – kms Jun 18 '20 at 00:37
  • `file` fails miserably for me: my executable is launched with a shell wrapper, with a very long command line. `file` fails to print the meaningful part of it. More answers are welcome. – Victor Sergienko Aug 08 '23 at 22:28
  • Your shell wrapper should be improved to log (using [logger(1)](https://man7.org/linux/man-pages/man1/logger.1.html)....) the full command line. Or your executable source code should store it in some file (perhaps under `/tmp/`) – Basile Starynkevitch Aug 09 '23 at 06:19
  • Actually some (conventional) file under `/var/tmp/` should be a better place and it survives reboot. Alternatively use [syslog(3)](https://man7.org/linux/man-pages/man3/syslog.3.html) in your source code. – Basile Starynkevitch Aug 09 '23 at 06:42
  • A possibility might be some segmentation violation or (less likely) some hardware failure. Be aware of [ASLR](https://en.wikipedia.org/wiki/Address_space_layout_randomization) – Basile Starynkevitch Aug 09 '23 at 06:52