TL;DR
The CDT ELF parser and ObjDump both fail to parse my relocatable file, but both work properly with executable files. I wonder whether the two failures are related.
ObjDump reports wrong abbrev offsets; the CDT parser throws a BufferUnderflowException.
The Problem
I'm working with the Eclipse CDT ELF parser to extract variable information from my output files. With my executable files, it works perfectly. With a relocatable file, however, it fails to read the file, but only after a certain point.
Edit: It seems the buffer underflow is not the real problem, but a consequence of data being misinterpreted. I'm still testing to see how the parser processes the file. I'll post more later.
Analysis
At first I thought the DWARF section must be corrupted. The ObjDump output is incomplete and comes with several errors. It only shows the types, and with wrong abbreviation numbers and offsets.
The log message:
$ C:\MinGW\bin\objdump.exe --dwarf %MYFILE% > %MYFILE%_objdump.txt
C:\MinGW\bin\objdump.exe: Warning: DIE at offset 0x1a9 refers to abbreviation number 8020 which does not exist
C:\MinGW\bin\objdump.exe: Warning: DIE at offset 0x1a9 refers to abbreviation number 8020 which does not exist
C:\MinGW\bin\objdump.exe: Warning: Unable to load/parse the .debug_info section, so cannot interpret the .debug_loc section.
C:\MinGW\bin\objdump.exe: Warning: Unable to load/parse the .debug_info section, so cannot interpret the .debug_ranges section.
readelf, however, reads the file perfectly, with all the sections and symbols. For the other binaries, the readelf and objdump outputs are the same.
In the .debug_info output from objdump I have, for example:
Compilation Unit @ offset 0x170:
Length: 0x7dc (32-bit)
Version: 3
Abbrev Offset: 0x0
Pointer Size: 4
<0><17b>: Abbrev Number: 1 (DW_TAG_compile_unit)
<17c> DW_AT_producer : (indirect string, offset: 0x0): object
<180> DW_AT_language : 4 (C++)
<181> DW_AT_name : (indirect string, offset: 0x0): object
<185> DW_AT_comp_dir : (indirect string, offset: 0x0): object
<189> DW_AT_low_pc : 0x0
<18d> DW_AT_high_pc : 0x0
<191> DW_AT_stmt_list : 0x0
<1><195>: Abbrev Number: 2 (DW_TAG_subprogram)
<196> DW_AT_name : (indirect string, offset: 0x704): OK
<19a> DW_AT_decl_file : 0
<19b> DW_AT_decl_line : 0
<19c> DW_AT_low_pc : 0x69050403
<1a0> DW_AT_high_pc : 0x400746e
<1a4> DW_AT_frame_base : 0 byte block: ()
<1a5> DW_AT_sibling : <0x2000170>
<2><1a9>: Abbrev Number: 8020
while with readelf I get:
Compilation Unit @ offset 0x170:
Length: 0x7dc (32-bit)
Version: 3
Abbrev Offset: 0xdb
Pointer Size: 4
<0><17b>: Abbrev Number: 1 (DW_TAG_compile_unit)
<17c> DW_AT_producer : (indirect string, offset: 0x3c1): GNU C++ 4.8.1 -mlittle-endian -march=armv7-a -mfpu=vfp -mfloat-abi=softfp -mapcs-frame -mlong-calls -gdwarf-3 -ansi -fno-zero-initialized-in-bss
<180> DW_AT_language : 4 (C++)
<181> DW_AT_name : (indirect string, offset: 0x192): C:/[...]
<185> DW_AT_comp_dir : (indirect string, offset: 0x452): C:\[...]
<189> DW_AT_low_pc : 0x50
<18d> DW_AT_high_pc : 0x204
<191> DW_AT_stmt_list : 0x4d
<1><195>: Abbrev Number: 2 (DW_TAG_base_type)
<196> DW_AT_byte_size : 4
<197> DW_AT_encoding : 7 (unsigned)
<198> DW_AT_name : (indirect string, offset: 0x2a0): long unsigned int
<1><19c>: Abbrev Number: 3 (DW_TAG_base_type)
In the second compilation unit, ObjDump didn't change the abbrev offset (it shows 0x0 where readelf shows 0xdb), and therefore it is using the abbrev numbers from the first compilation unit. This means it interprets the following values as attributes of a different type: where there should be a base type, it reads a subprogram.
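One way to see which value is actually stored in the file is to decode the compilation unit headers by hand. The following is only a minimal sketch, assuming 32-bit DWARF versions 2 to 4, little-endian data, and that the raw `.debug_info` bytes have already been extracted into a `bytes` object. The names `iter_cu_headers` and `make_cu` are hypothetical, and the demo data is synthetic: it only mirrors the offsets and lengths shown above. The sketch reads the on-disk bytes without applying any relocations, so it cannot by itself tell which tool is right.

```python
import struct

# Layout of a 32-bit DWARF v2-v4 compilation unit header (little-endian):
# unit_length (4), version (2), debug_abbrev_offset (4), address_size (1).
CU_HEADER = struct.Struct("<IHIB")


def iter_cu_headers(debug_info: bytes):
    """Yield one dict per compilation unit found in raw .debug_info bytes."""
    offset = 0
    while offset + CU_HEADER.size <= len(debug_info):
        unit_length, version, abbrev_offset, address_size = CU_HEADER.unpack_from(
            debug_info, offset
        )
        if unit_length == 0xFFFFFFFF:
            raise ValueError("64-bit DWARF is not handled by this sketch")
        yield {
            "offset": offset,
            "unit_length": unit_length,
            "version": version,
            "abbrev_offset": abbrev_offset,
            "address_size": address_size,
            "first_die": offset + CU_HEADER.size,
        }
        # unit_length does not count its own 4 bytes.
        offset += 4 + unit_length


def make_cu(unit_length, version, abbrev_offset, address_size):
    """Build a synthetic unit (header plus zero padding) for the demo."""
    header = CU_HEADER.pack(unit_length, version, abbrev_offset, address_size)
    return header + bytes(4 + unit_length - len(header))


# Synthetic data: a first unit that ends at 0x170, then a unit like the one above.
demo = make_cu(0x16C, 3, 0x0, 4) + make_cu(0x7DC, 3, 0xDB, 4)
for cu in iter_cu_headers(demo):
    print(hex(cu["offset"]), hex(cu["abbrev_offset"]), hex(cu["first_die"]))
```

The header is 11 bytes long, which matches the dumps above: the unit starts at 0x170 and its first DIE is at 0x17b.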
Another important fact is that this firmware has dynamic allocation, unlike all the other ones that worked.
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file)
Machine: ARM
Version: 0x1
Entry point address: 0x0
Start of program headers: 0 (bytes into file)
Start of section headers: 18150404 (bytes into file)
Flags: 0x5000000, Version5 EABI
Size of this header: 52 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 40 (bytes)
Number of section headers: 1636
Section header string table index: 1633
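For comparing the working and failing binaries without relying on either tool, the same header fields can be decoded directly. This is a rough sketch, assuming an ELF32 little-endian file as in the dump above; `parse_elf32_header` is a hypothetical helper name, and the demo header is rebuilt from the values in the dump rather than read from the real file.

```python
import struct

# ELF32 header: e_ident[16], e_type, e_machine, e_version, e_entry, e_phoff,
# e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
# e_shstrndx. Total size is 52 bytes.
ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")

ET_NAMES = {1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}
EM_ARM = 40


def parse_elf32_header(data: bytes) -> dict:
    """Decode the fixed-size header at the start of an ELF32 LE file."""
    (ident, e_type, e_machine, _version, e_entry, e_phoff, e_shoff,
     e_flags, e_ehsize, _phentsize, e_phnum, e_shentsize, e_shnum,
     e_shstrndx) = ELF32_HEADER.unpack_from(data, 0)
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 1 or ident[5] != 1:
        raise ValueError("this sketch only handles ELF32 little-endian")
    return {
        "type": ET_NAMES.get(e_type, hex(e_type)),
        "is_arm": e_machine == EM_ARM,
        "entry": e_entry,
        "phoff": e_phoff,
        "phnum": e_phnum,
        "shoff": e_shoff,
        "flags": e_flags,
        "ehsize": e_ehsize,
        "shentsize": e_shentsize,
        "shnum": e_shnum,
        "shstrndx": e_shstrndx,
    }


# Header rebuilt from the readelf values above (not read from the real file).
ident = bytes.fromhex("7f454c46010101000000000000000000")
demo_header = ELF32_HEADER.pack(
    ident, 1, EM_ARM, 1, 0, 0, 18150404, 0x5000000, 52, 0, 0, 40, 1636, 1633
)
info = parse_elf32_header(demo_header)
print(info["type"], info["shnum"], info["shstrndx"])
```

With a real file, the input would be the first 52 bytes read in binary mode, for example `open(path, "rb").read(52)`.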
I know "objdump sees an ELF file through a BFD filter" while "the readelf program does not link against the BFD library". However, I couldn't find any relation between this and the CDT ELF parser, though the problem points in this direction.
Questions
Well, it's a very dense problem with one thing linked to another, so I have a lot of questions. But feel free to answer only one, or half. Or even give me ideas about what to test.
- What is happening, and why?
- Is ObjDump unable to parse relocations (if the problem is there)?
- Do ObjDump and CDT have libraries in common (from gcc maybe)?
- How does readelf parse its file? Is it possible to replicate it?
Edit
Because it happens with only one of my ELF files, a very complex firmware, it is difficult to reduce it to a Minimal, Complete, and Verifiable example. But if you have an idea of how I can provide more information, please tell me.