
We are looking for a procedure that lets us easily list all the files that are compiled together to make an executable.

Use case: Suppose we have a large repository and we want to know which files in that repository are compiled to make an executable (e.g. a.out).

For example :

dwarfdump a.out | grep "NS uri"
0x0000064a  [   9, 0] NS uri: "/home/main.c"
0x000006dd  [   2, 0] NS uri: "/home/zzzz.c"
0x000006f1  [   2, 0] NS uri: "/home/yyyy.c"
0x00000705  [   2, 0] NS uri: "/home/xxxx.c"
0x00000719  [   2, 0] NS uri: "/home/wwww.c"

But this does not list the header files. Please suggest an approach.

M Gem
  • Why do you ask, and what is the use case? Please **edit your question** to improve it, motivate it, and give more context. – Basile Starynkevitch Dec 07 '17 at 06:32
  • Even with the edits, we don't understand if you want only file paths or more. What does *location* mean in your edited question? – Basile Starynkevitch Dec 08 '17 at 05:51
  • However, your question shows lack of knowledge about DWARF. Please study that format in more detail – Basile Starynkevitch Dec 08 '17 at 06:00
  • I have modified it; I hope you can understand it now. – M Gem Dec 08 '17 at 06:08
  • Even with the latest edit, your question is not clear enough. I guess that you want the `DW_AT_decl_file` DWARF attributes (as mentioned in my answer), but you might need more. Again, you need to study the DWARF format (so you need to spend a few days reading about it). And you should explain *why* (and *what for*) you want these files. – Basile Starynkevitch Dec 08 '17 at 06:09
  • BTW, your file path list is very naive. What about `stdio.h`? What about the *internal* headers it is referring to? What about the files *implementing* `printf` (e.g. [src/stdio/printf.c](http://git.musl-libc.org/cgit/musl/tree/src/stdio/printf.c) if you use [musl-libc](http://musl-libc.org/))? So even with the edits, your question is not clear enough. – Basile Starynkevitch Dec 08 '17 at 06:14
  • But DW_AT_decl_file doesn't list the header files. – M Gem Dec 08 '17 at 06:18
  • I think I have made my question simple. I don't know what you are looking for. I don't need predefined sources and headers such as those under /usr/include. – M Gem Dec 08 '17 at 06:19
  • I improved my answer even more. But your question is still a bit naive when you think of all the details (what about *generated* `.c` files, e.g. with `bison` and `#line` directives). I still don't understand what you really want. Perhaps passing `-H` or some `-M` flag to `gcc` might be enough. – Basile Starynkevitch Dec 08 '17 at 06:22
  • Your use case is too vague. Depending on how your question is understood, it could be trivial (e.g. collect the information given by `gcc -M -H`) or impossible. So I downvoted your question and voted to close it as unclear. And header files often don't contain any code, so you don't have information about them in DWARF. Without much more motivation and context (why do you need to collect all the files, and what exact files do you want) your question is too broad and unclear. – Basile Starynkevitch Dec 08 '17 at 06:35
  • Your question is probably an [XY problem](http://xyproblem.info/). – Basile Starynkevitch Dec 08 '17 at 06:55

1 Answer


How to Extract Source Code From Executable with Debug Symbol Available?

You cannot do that. I guess you are on Linux/x86-64 (your question is specific to the operating system, the ABI, and the debugging format). Of course, you should pass -g (or even -g3) to every gcc compilation command for your executable. If any translation unit (perhaps including those of shared libraries!) is compiled without -g or -g3, you might not have enough information.

Even with debug information in DWARF format, the ELF executable does not contain source code, only references to it (e.g. source file paths and positions given as line and column numbers). So the debug information contains things like file src/foo.c, line 34, column 5, but nothing about the content of src/foo.c near that position. Of course, once gdb knows the file path src/foo.c, it can read that source file (if it is available and up to date with respect to the executable) and list it.

Extracting that debugging metadata is a different question. Once you understand DWARF, you could use tools like objdump, readelf, addr2line, dwarfdump, or libdwarf; you could also script gdb (recent versions of GDB can be extended in Python or in Guile) and use it on your ELF executable.
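As one hedged illustration of scripting gdb from the outside, the sketch below runs GDB's `info sources` command in batch mode and splits its output into file names. The function names are hypothetical, and the parsing assumes the usual layout (heading lines ending in a colon, then comma-separated paths), which may differ between GDB versions.

```python
import subprocess


def parse_info_sources(output):
    """Collect file names from the text printed by gdb's `info sources`.

    Assumption: heading lines end with ':' and notes start with '(';
    everything else is a comma-separated list of source paths.
    """
    files = []
    for line in output.splitlines():
        line = line.strip()
        if not line or line.endswith(":") or line.startswith("("):
            continue
        for name in line.split(", "):
            name = name.strip().rstrip(",")
            if name and name not in files:
                files.append(name)
    return files


def gdb_source_files(executable):
    """Run gdb non-interactively on `executable` and return its source list."""
    proc = subprocess.run(
        ["gdb", "-batch", "-ex", "info sources", executable],
        capture_output=True, text=True, check=True,
    )
    return parse_info_sources(proc.stdout)
```

Calling `gdb_source_files("a.out")` needs gdb installed and an executable built with -g. Headers show up only when the debug information actually refers to them.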

Perhaps you should consider Ian Taylor's libbacktrace. It uses the DWARF information to provide nice looking backtraces at runtime.

BTW, cgdb is (like ddd) only a front-end to gdb, which does all the real work of processing that DWARF information. GDB is free software, so you can study its source code.

I have only a.out, and I want to list the file names

You might try dwarfdump -i a.out | grep DW_AT_decl_file, and you could use some GNU awk command instead of grep. You need to dive into the details of the DWARF specification, and you need to understand more about the elf(5) format.
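The grep pipeline can also be written as a small script that removes duplicates. This is a minimal sketch, not a definitive tool: the function names are hypothetical, and it assumes each `DW_AT_decl_file` line ends with the file path, either after a hexadecimal index (libdwarf's dwarfdump) or quoted in parentheses (llvm-dwarfdump).

```python
import re
import subprocess

_DECL_FILE = re.compile(r"DW_AT_decl_file\b(.*)")
_HEX_ONLY = re.compile(r"0x[0-9a-fA-F]+")


def decl_files(dump_text):
    """Return the unique DW_AT_decl_file paths in dump_text, in first-seen order."""
    seen = []
    for line in dump_text.splitlines():
        match = _DECL_FILE.search(line)
        if not match:
            continue
        rest = match.group(1).strip().strip("()").strip()
        # Drop a leading file-table index such as 0x00000001, if present.
        rest = re.sub(r"^0x[0-9a-fA-F]+\s+", "", rest)
        path = rest.strip('"')
        if not path or _HEX_ONLY.fullmatch(path):
            continue  # only an index was printed, no path to record
        if path not in seen:
            seen.append(path)
    return seen


def decl_files_of(executable):
    """Run `dwarfdump -i` on the executable and parse its output."""
    proc = subprocess.run(
        ["dwarfdump", "-i", executable],
        capture_output=True, text=True, check=True,
    )
    return decl_files(proc.stdout)
```

As with the grep version, a header appears in the result only if some declaration that made it into the debug information came from it.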

It doesn't list all the header files

This is expected. Most header files don't contain any code, only declarations (e.g. printf is not implemented in <stdio.h> but in some C source file of your C standard library, e.g. in src/stdio/printf.c if you use musl-libc; it is just declared in /usr/include/stdio.h). DWARF (and other debug information formats) describe the binary code. And some header files get included only to give access to a few preprocessor macros (which get expanded or skipped at preprocessing time).

If you dream of homoiconic programming languages, try Common Lisp (e.g. with SBCL).

If your question is how to use gdb, then please read the Debugging with GDB manual.

If your question is about decompilers, be aware that decompilation is impossible in general (e.g. because of Rice's theorem). BTW, programs in most Linux distributions are generally free software, so it is quite easy to get their source code (and you could even avoid using proprietary software on Linux).

BTW, you could also do more at compilation time by passing more flags to gcc. You might pass -H or -M (etc.) to gcc (in addition to -g). You could even consider writing your own GCC plugin to collect the information you want in some database (but that is probably not worth the effort). You could also consider improving your build automation (e.g. adding more to your Makefile) to collect such information. BTW, many large C programs use metaprogramming techniques: some .c files are generated by tools (e.g. bison) or scripts and may contain #line directives. Which file path do you want to keep in that case?
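To make the -H suggestion concrete: gcc -H prints each included header on stderr, prefixed by one dot per nesting level. The sketch below (hypothetical function name) turns that text into (depth, path) pairs. It assumes the captured text came from something like gcc -H -c main.c 2> headers.txt, and it ignores the trailing "Multiple include guards" section.

```python
import re

# One or more dots, a single space, then the header path.
_H_LINE = re.compile(r"^(\.+) (\S.*)$")


def parse_gcc_h(stderr_text):
    """Return (nesting_depth, header_path) pairs from `gcc -H` output."""
    headers = []
    for line in stderr_text.splitlines():
        match = _H_LINE.match(line)
        if match:
            headers.append((len(match.group(1)), match.group(2)))
    return headers
```

Depth 1 means the header was included directly by the compiled .c file; deeper levels were pulled in by other headers.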

We are looking for a procedure that lets us easily list all the files that are compiled together to make an executable.

If you are writing that executable and compiling it from its source code, I would suggest collecting that information at build time. It could be as trivial as passing some -M and/or -H flag to gcc, perhaps recording the result in some generated timestamp.c file (see this for inspiration; but your timestamp.c might contain information provided by gcc -M, etc.). Your timestamp file might also contain git version control metadata (like the one generated in this Makefile). Read also about reproducible builds and about package managers.
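A build-time collection step could look roughly like the following sketch. It parses the make-style rules that gcc -M writes (a target, a colon, the prerequisites, with backslash-newline continuations) and drops files under system directories, which the asker said are not wanted. The function names and default prefixes are assumptions, and paths containing escaped spaces are not handled. gcc -MM omits system headers by itself, which may make the filter unnecessary.

```python
def parse_make_deps(dep_text):
    """Return the unique prerequisite paths from `gcc -M` style rules."""
    # Join lines that were continued with a trailing backslash.
    joined = dep_text.replace("\\\n", " ")
    deps = []
    for rule in joined.splitlines():
        _target, sep, prereqs = rule.partition(":")
        if not sep:
            continue  # not a "target: prerequisites" line
        for path in prereqs.split():
            if path not in deps:
                deps.append(path)
    return deps


def project_files(deps, system_prefixes=("/usr/include/", "/usr/lib/")):
    """Keep only the paths that are not under a system directory."""
    return [path for path in deps if not path.startswith(system_prefixes)]
```

Running this over the concatenated -M output of every translation unit yields one list of sources and headers per executable, which the build could then write into a generated file.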

Basile Starynkevitch
  • Thanks for your reply. But how can gdb extract the source code and show it in a separate window (gdb -tui a.out), the same as cgdb? If you try it, you can see the same code being displayed. How is the debugger able to show the real source code? – M Gem Dec 07 '17 at 06:55
  • I improved my answer to explain how. – Basile Starynkevitch Dec 07 '17 at 06:59
  • OK, I got your point. According to you, we will only get the file name, function name, line number, etc., right? So how can I extract the source file name and its path with the help of gdb? – M Gem Dec 07 '17 at 07:41
  • I don't understand what you really mean. Please **edit your question** to improve it (don't comment here to improve your question). I recommend that you *motivate* your question and give more context. You should add two paragraphs (with motivation and context) to your question, not as comments. – Basile Starynkevitch Dec 07 '17 at 18:39
  • @MGem: please improve your own question by editing it. It was downvoted, so it needs improvement. – Basile Starynkevitch Dec 07 '17 at 18:45