Reading DIEs in ELF file

Question

Hello, I am fairly new to the DWARF standard and the ELF format, and I have a few questions. I am using the DWARF 2 standard and have a basic understanding of how DIEs work, but I need more clarity on how they are represented in bytes.

The ELF wiki provides a good table showing the order in which the bytes go in the program header, sections, and segments. But what is the correct way to represent DIEs in bytes for the DWARF 2 standard?

I have tried digging through the DWARF standard PDF documents to understand how DIEs are represented in bytes. Perhaps there is a section I am missing?

I would like to use this information to delete certain DIEs and so save space in the debugging section. I am only interested in the DIEs that provide variable addresses.

score 1 · Answer 1 · answered Aug 24 '18 at 22:23

I recommend that anyone starting out in DWARF begin with the Introduction to the DWARF Debugging Format. It's a very concise overview that provides an excellent foundation for exploring the format in further depth. Armed with this background, compile a debug version of a very simple program and compare a hex dump of the two ELF sections .debug_abbrev and .debug_info with the output of dwarfdump or readelf.
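If you want the raw bytes for that comparison without a separate hex-dump tool, something like the following Python sketch may help. It is not a full ELF reader: it assumes a 64-bit little-endian ELF file with uncompressed sections, and the helper name read_section is made up for this example.

```python
import struct

# Layouts of the ELF64 file header and of one section header entry.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")
ELF64_SECTION = struct.Struct("<IIQQQQIIQQ")


def read_section(elf, wanted):
    """Return the raw bytes of the section named `wanted`, or None if absent.

    Hypothetical helper: handles only 64-bit little-endian ELF files and
    does not decompress compressed sections.
    """
    header = ELF64_HEADER.unpack_from(elf, 0)
    ident = header[0]
    shoff, shentsize, shnum, shstrndx = header[6], header[11], header[12], header[13]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a 64-bit little-endian ELF file")

    sections = [
        ELF64_SECTION.unpack_from(elf, shoff + i * shentsize)
        for i in range(shnum)
    ]

    # The section-name string table holds each section's NUL-terminated name.
    str_off, str_size = sections[shstrndx][4], sections[shstrndx][5]
    names = elf[str_off:str_off + str_size]

    for sh_name, _type, _flags, _addr, sh_offset, sh_size, *_rest in sections:
        end = names.index(b"\x00", sh_name)
        if names[sh_name:end] == wanted.encode():
            return elf[sh_offset:sh_offset + sh_size]
    return None
```

Usage would be along the lines of read_section(open("a.out", "rb").read(), ".debug_info").hex(), where a.out stands for whatever debug build you produced. The result can then be lined up against the readelf or dwarfdump output.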

Once you are broadly familiar with the encoding of a DIE you will see that simply deleting its corresponding bytes from .debug_info would corrupt the entire file, in terms of both DWARF and ELF. For example, each DIE is identified by its offset within .debug_info (relative to the start of its compilation unit or of the section, depending on the reference form); deleting one DIE's bytes would alter the offsets of all subsequent DIEs, and any references to them would therefore be broken. A robust solution would require parsing the DWARF to create an internal representation of the tree before eliminating unwanted nodes and writing out new DWARF. After modifying .debug_info you'd then need to edit the fabric of the ELF itself: at the very least, this would involve updating the section header table to reflect the new offsets for any shifted sections and updating any relocations.
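To make the offset problem concrete, here is a toy Python model. The DIE names and byte sizes are invented purely for illustration; the only real-world number is the 11-byte compilation unit header of 32-bit DWARF 2, which is why the first DIE starts at offset 11.

```python
# Toy model: each DIE is (name, size_in_bytes). Names and sizes are made up.
dies = [
    ("int_type", 7),
    ("unused_var", 12),
    ("char_type", 7),
    ("wanted_var", 12),
]


def layout(dies, start=11):
    """Assign each DIE its offset from the start of the compilation unit."""
    offsets, pos = {}, start
    for name, size in dies:
        offsets[name] = pos
        pos += size
    return offsets


before = layout(dies)

# Suppose wanted_var carries a type reference to char_type. The number stored
# in wanted_var's bytes is char_type's offset at the time the file was written.
stored_ref = before["char_type"]

# Naively cut unused_var's bytes out: everything after it slides down.
after = layout([d for d in dies if d[0] != "unused_var"])

print("reference stored in wanted_var:", stored_ref)
print("where char_type actually lives now:", after["char_type"])
```

The stored reference still holds the old offset, so after the cut it points into the middle of some other DIE. A real rewrite would have to recompute every such reference, and the unit length in the compilation unit header as well.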

If your principal concern is indeed space saving then I suggest you instead investigate what compiler options you have. The Oracle Studio Compilers, for example, allow fine control over the content included in the DWARF. Depending on your compiler and OS it may also be possible to emit files with compressed DWARF sections (e.g. .zdebug_info) or even leave the DWARF in different files altogether. The problem of DWARF bloat is well known and, if you are interested in tackling it at a low level yourself, you will find other suggestions in Michael Eager's introduction and in later versions of the standard.


score 0 · Answer 2 · answered Aug 21 '18 at 19:07

The format is explained on page 66, in sections 7.5.2 and 7.5.3.

The example in appendix 2 (page 93) is much clearer.

Each DIE references a corresponding entry in .debug_abbrev, which defines the DIE's "signature", i.e.:

its type (DW_TAG_*);
whether it has child DIEs;
its attributes (DW_AT_*) and their forms (DW_FORM_*).

The format of a DIE is:

a reference to an abbreviation (unsigned LEB128, i.e. variable length);
a reference of 0 is used to end a list of children;
one value per attribute (using the encoding associated with the given form).
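The layout above can be sketched in Python roughly as follows. This is a minimal illustration rather than a complete parser: it only handles a handful of forms, the function names are my own, and the sample bytes are hand-built instead of taken from a real file. In a real .debug_info the DIEs of each compilation unit follow an 11-byte header (for 32-bit DWARF 2), which is skipped here.

```python
def read_uleb128(data, pos):
    """Decode an unsigned LEB128 value; return (value, next_pos)."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def parse_abbrev_table(data, pos=0):
    """Parse one abbreviation table: code -> (tag, has_children, attrs)."""
    table = {}
    while True:
        code, pos = read_uleb128(data, pos)
        if code == 0:                      # a code of 0 ends the table
            return table
        tag, pos = read_uleb128(data, pos)
        has_children = data[pos] != 0      # DW_CHILDREN_yes / DW_CHILDREN_no
        pos += 1
        attrs = []
        while True:
            name, pos = read_uleb128(data, pos)
            form, pos = read_uleb128(data, pos)
            if name == 0 and form == 0:    # (0, 0) ends the attribute list
                break
            attrs.append((name, form))
        table[code] = (tag, has_children, attrs)


# Only a few forms are handled in this sketch.
FIXED_SIZE = {0x0B: 1, 0x05: 2, 0x06: 4}   # DW_FORM_data1 / data2 / data4
DW_FORM_string = 0x08
DW_FORM_block1 = 0x0A


def read_die(info, pos, abbrevs):
    """Read one DIE; return (die, next_pos). die is None for a null entry."""
    code, pos = read_uleb128(info, pos)
    if code == 0:                          # null entry: end of a child list
        return None, pos
    tag, has_children, attrs = abbrevs[code]
    values = {}
    for name, form in attrs:
        if form in FIXED_SIZE:
            size = FIXED_SIZE[form]
            values[name] = int.from_bytes(info[pos:pos + size], "little")
            pos += size
        elif form == DW_FORM_string:       # inline NUL-terminated string
            end = info.index(b"\x00", pos)
            values[name] = info[pos:end].decode("ascii")
            pos = end + 1
        elif form == DW_FORM_block1:       # 1-byte length, then that many bytes
            length = info[pos]
            values[name] = info[pos + 1:pos + 1 + length]
            pos += 1 + length
        else:
            raise NotImplementedError(f"form {form:#x} not handled in this sketch")
    return (tag, has_children, values), pos


# Hand-built sample: abbreviation 1 = DW_TAG_variable (0x34), no children,
# DW_AT_name (0x03) as DW_FORM_string, DW_AT_location (0x02) as DW_FORM_block1.
abbrev_bytes = bytes([0x01, 0x34, 0x00, 0x03, 0x08, 0x02, 0x0A, 0x00, 0x00, 0x00])
# One DIE using abbreviation 1: name "x", location = DW_OP_addr (0x03) 0x1000,
# followed by a null entry.
info_bytes = bytes([0x01]) + b"x\x00" + bytes([0x05, 0x03, 0x00, 0x10, 0x00, 0x00, 0x00])

abbrevs = parse_abbrev_table(abbrev_bytes)
die, next_pos = read_die(info_bytes, 0, abbrevs)
print(die, next_pos)
```

Note that the DIE itself carries no tag or attribute names, only the abbreviation code and the values. That is why .debug_info cannot be decoded without the matching .debug_abbrev table.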
