I am fixing a bug in a parser for DWARF debug information (DWARF version 2). While doing so, I made the following strange observation:

A byte stream was created by reading a DLL file (built from Ada sources with GNAT). At the position of a "DW_TAG_structure_type" entry in .debug_info, an additional byte with the value 1 seems to have crept into the byte stream, so all subsequent values read from the FileInputStream are shifted by one byte.

This is what the original DIE in .debug_info looks like:

 <1><3aa824>: Abbrev Number: 129 (DW_TAG_structure_type)
    <3aa826>   DW_AT_byte_size   : 44   
    <3aa827>   DW_AT_decl_file   : 11   
    <3aa828>   DW_AT_decl_line   : 380  
    <3aa82a>   DW_AT_artificial  : 1    
    <3aa82b>   DW_AT_sibling     : <0x3aa888>

This is the corresponding abbreviation declaration for the DIE in .debug_abbrev:

129      DW_TAG_structure_type    [has children]
    DW_AT_byte_size    DW_FORM_data1
    DW_AT_decl_file    DW_FORM_data1
    DW_AT_decl_line    DW_FORM_data2
    DW_AT_artificial   DW_FORM_flag
    DW_AT_sibling      DW_FORM_ref4
    DW_AT value: 0     DW_FORM value: 0

However, when I display the bytestream at this point, these values are shown:

Abbrev Number  >>Strange Byte<<  DW_AT_byte_size  DW_AT_decl_file
     81               01                2C              0B         ...
    (129)             ??               (44)            (11)

Does anyone know what this "Strange Byte" is all about?

user946822

1 Answer

I'm not really familiar with DWARF, but the DWARF 2.0 specification says (section 7.5.3):

Following the tag encoding is a 1-byte value that determines whether a debugging information entry using this abbreviation has child entries or not. If the value is DW_CHILDREN_yes, the next physically succeeding entry of any debugging information entry using this abbreviation is the first child of the prior entry. If the 1-byte value following the abbreviation’s tag encoding is DW_CHILDREN_no, the next physically succeeding entry of any debugging information entry using this abbreviation is a sibling of the prior entry. [...]

Finally, the child encoding is followed by a series of attribute specifications. [...]

So, could this "strange byte" represent DW_CHILDREN_yes?

I'm also a little bit puzzled by the value 0x81 (129). The specification states that the tag encoding for DW_TAG_structure_type is 0x13 (which should fit in a byte), and the previous quote suggests that the tag encoding is followed by a byte that is not part of the tag encoding itself (if I understand correctly). So I would expect a stream of 0x13 0x01 (encoded tag + has child entries flag).

DeeDee
  • (1) Thanks for the help. I think I was not precise enough above: the byte stream I showed is from **debug_info**. The structure you described applies to debug_abbrev. The standard says about debug_abbrev: `"Each declaration begins with an unsigned LEB128 number representing the abbreviation code itself. It is this code that appears at the beginning of a debugging information entry in the .debug_info [...] The abbreviation code is followed by another unsigned LEB128 number that encodes the entry's tag."` – user946822 Jan 27 '22 at 07:42
  • (2) The corresponding byte sequence for this DIE looks like this in debug_abbrev: `8101 1301 0b0b 3a0b`. Where `0x81 0x01` is the LEB128 "abbreviation code" and `0x13 0x01` represents DW_TAG_structure_type with child elements. But I am now quite confused why 129 is also converted to `0x81 0x01` in debug_abbrev. Does this have something to do with LEB128? – user946822 Jan 27 '22 at 07:46
  • @user946822 Ok, sorry, didn’t catch that. I indeed looked at debug_abbrev. But regarding 129, isn’t it true that LEB128 (129) = LEB128 (`0x81`) = `0x81 0x01`? – DeeDee Jan 27 '22 at 08:03
  • Yes, you're absolutely right. That must have passed me by. Thank you! – user946822 Jan 27 '22 at 13:07
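
To make the conclusion of this thread concrete, here is a minimal Python sketch of unsigned LEB128 decoding and encoding, applied to the start of the DIE from the question. The function names `read_uleb128` and `encode_uleb128` are made up for illustration and are not taken from any DWARF library. Only the bytes `81 01 2C 0B` appear in the question; the following bytes `7C 01 01` are reconstructed from the readelf values (decl_line 380, artificial 1) under the assumption of little-endian data, and the DW_AT_sibling bytes are left out because the compilation unit offset is not given.

```python
def read_uleb128(data, pos=0):
    """Decode one unsigned LEB128 value starting at pos.

    Each byte carries 7 payload bits (least significant group first);
    the high bit is set on every byte except the last one.
    Returns (value, position of the next unread byte).
    """
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return result, pos
        shift += 7


def encode_uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more groups follow
        else:
            out.append(byte)         # last group, high bit clear
            return bytes(out)


# Start of the DIE at <3aa824>. The first four bytes are from the question;
# the last three are an assumption (little-endian, derived from readelf).
die = bytes([0x81, 0x01, 0x2C, 0x0B, 0x7C, 0x01, 0x01])

abbrev_code, pos = read_uleb128(die)                    # ULEB128, 2 bytes here
byte_size = die[pos]; pos += 1                          # DW_FORM_data1
decl_file = die[pos]; pos += 1                          # DW_FORM_data1
decl_line = int.from_bytes(die[pos:pos + 2], "little")  # DW_FORM_data2
pos += 2
artificial = die[pos]; pos += 1                         # DW_FORM_flag

print(abbrev_code, byte_size, decl_file, decl_line, artificial)
# prints: 129 44 11 380 1
```

The "strange byte" `01` is therefore the second byte of the abbreviation code: 129 does not fit into 7 bits, so it is stored as `0x81 0x01`. This also matches the readelf offsets in the question, where the abbreviation number starts at `3aa824` and the first attribute only starts at `3aa826`.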