How to get from a line in the disassembled .text section of an ELF file to the corresponding line in the source code

Question

I am trying to create a line coverage program in Python.

The goal is to take a list of PCs recorded while a test runs on a device and report which functions, conditions and lines of the device's firmware (FW) were covered by the test.

The device has an ARC processor, if that helps.

I have the ELF binaries and the source code (written in C), but can't share them here (confidential company information).

I have extracted the .text section from the ELF file and disassembled it, so I have the assembly code of the FW in this form:

PC: hex_opcode assembly_command operands

like this:

0x100: 7eff mov a,b

Using pyelftools by Eli Bendersky (https://github.com/eliben/pyelftools), I have managed to get the source file and line number where each function begins, so I can map the assembly code of each function to its source file.

Using the low PC and high PC of each function, I have also managed to link the PCs from the test logs to functions.

But now I am stuck trying to map individual assembly lines to their location in C source code.

I know that for this I need to read information in .debug_line in DWARF, but I can't quite understand it.

I have managed to come across this:

https://wiki.osdev.org/DWARF

Where they say: Line += Line base + (Opcode - Opcode base) % Line range

I have all the information except Line and Opcode.

By "Line", do they mean the line where the function starts (for example, line 5 of source.c if "void func()" is located there), or the previous line?

By "Opcode", do they mean the major opcode of the assembly instruction, the full instruction encoding (like 0x7eff in binary representation), or some other opcode from the DWARF info?

From what I understood, the calculation is done in decimal, so the opcode must be converted to decimal.

Thanks in advance for the help.

Vadim

The opcode is the DWARF opcode, not the assembly/machine code. It obviously does not matter what number base you use, so there is nothing to convert. There is an example in the wiki page you linked, and it even points you to the detailed specification. You can make a small non-confidential program to test with. – Jester, Apr 22 '20 at 12:24
Are these the DWARF opcodes: DW_LNE_set_address, DW_LNS_advance_line, DW_LNS_copy, etc.? I got those from .debug_line. – vadim6385, Apr 22 '20 at 15:13
OK, got it. I saw the standard and extended opcode numbers from DWARF, but I can't find where to get the special opcode numbers from. Also, does "line" mean the beginning of the function, or the previous line of code? – vadim6385, Apr 22 '20 at 15:54
For example, if the function goes like 1. void main() 2. { 3. printf("hello world"); 4. } do I start each calculation from line 1, or add to the previous line? – vadim6385, Apr 22 '20 at 15:56
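
To make the arithmetic in this thread concrete, here is a minimal sketch of how a special opcode is decoded. In DWARF, any line-program opcode byte whose value is at least opcode_base is a special opcode, so there is no separate list of names to look up. The function name is hypothetical, the header values used below (opcode_base 13, line_base -5, line_range 14) are only illustrative, and the sketch assumes a non-VLIW target where one operation per instruction applies.

```python
def decode_special_opcode(opcode, opcode_base, line_base, line_range,
                          min_instruction_length=1):
    """Return (address_advance, line_advance) for one DWARF special opcode.

    The header fields come from the .debug_line program header of the
    compilation unit; this helper itself is a hypothetical illustration.
    """
    if not opcode_base <= opcode <= 255:
        raise ValueError("not a special opcode: %d" % opcode)
    adjusted = opcode - opcode_base
    # The quotient advances the address register, the remainder the line register.
    address_advance = (adjusted // line_range) * min_instruction_length
    line_advance = line_base + (adjusted % line_range)
    return address_advance, line_advance


# Illustrative header values, not taken from any real firmware image.
addr_delta, line_delta = decode_special_opcode(
    0x4B, opcode_base=13, line_base=-5, line_range=14)
print(addr_delta, line_delta)
```

Both results are deltas. The line register starts at 1 at the beginning of each sequence, and every opcode adds to the previous value of the register rather than to the first line of the function.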

score 1 · Answer 1 · answered Apr 22 '20 at 15:10


But now I am stuck trying to map individual assembly lines to their location in C source code.

You want two things:

1. Build your firmware with debugging info. Usually, you just want to add -g to all existing compile and link lines.

   Note: do not remove any optimization flags, or the compiled code will no longer match the binary for which you collected coverage.

   Note: if your build process runs strip, you'll need to save the binary before stripping it.

2. Use decode_file_line from dwarf_decode_address.py (one of the pyelftools examples) to map each address (PC) you collected to a (file, line) pair.
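
For a coverage run with many PCs, one possible approach is to build the address table once and then look up each PC with a binary search. The sketch below assumes the line table has already been extracted into (start_address, end_address, file, line) rows with non-overlapping ranges. All function names and the sample rows are hypothetical, chosen only for illustration.

```python
import bisect
from collections import Counter


def build_index(rows):
    """Sort (start, end, file, line) rows by start address for bisect lookups."""
    rows = sorted(rows, key=lambda row: row[0])
    starts = [row[0] for row in rows]
    return starts, rows


def lookup(index, pc):
    """Return (file, line) for the row whose [start, end) range contains pc."""
    starts, rows = index
    i = bisect.bisect_right(starts, pc) - 1
    if i < 0:
        return None
    start, end, filename, line = rows[i]
    return (filename, line) if pc < end else None


def covered_lines(index, pcs):
    """Count how many collected PCs landed on each (file, line) pair."""
    hits = Counter()
    for pc in pcs:
        location = lookup(index, pc)
        if location is not None:
            hits[location] += 1
    return hits


# Made-up rows and PCs, standing in for a real line table and test log.
index = build_index([(0x104, 0x10C, "main.c", 4), (0x100, 0x104, "main.c", 3)])
print(covered_lines(index, [0x100, 0x102, 0x10A, 0x200]))
```

PCs that fall outside every range (0x200 here) are skipped, which is usually what you want for addresses in code without debug info.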

Employed Russian


This gets me only to the function (which is explicitly set in the "subprogram" DIE entry). I need to get to specific lines in the code. About building the firmware, I will check, thanks. – vadim6385 Apr 22 '20 at 15:22
nvm, think I got it. lineprog.get_entries() is what I probably need – vadim6385 Apr 22 '20 at 16:43
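
A minimal sketch of what iterating lineprog.get_entries() can look like, modelled on the approach in the pyelftools dwarf_decode_address.py example: each pair of consecutive states gives one address range that belongs to the earlier state's file and line. The helper names rows_from_entries and load_rows are hypothetical, and the file-index offset (1 before DWARF 5, 0 from DWARF 5 on) follows what that example does.

```python
def rows_from_entries(entries, file_names, file_index_base=1):
    """Turn line-program entries into (start, end, file, line) rows."""
    rows = []
    prev = None
    for entry in entries:
        state = entry.state
        if state is None:
            # This opcode did not emit a row in the line table.
            continue
        if prev is not None and prev.address < state.address:
            filename = file_names[prev.file - file_index_base]
            rows.append((prev.address, state.address, filename, prev.line))
        # An end_sequence state only marks the end address of the sequence.
        prev = None if state.end_sequence else state
    return rows


def load_rows(elf_path):
    """Collect rows for every compilation unit in an ELF file (needs pyelftools)."""
    # Imported here so rows_from_entries stays usable without pyelftools.
    from elftools.elf.elffile import ELFFile

    rows = []
    with open(elf_path, "rb") as f:
        elf = ELFFile(f)
        if not elf.has_dwarf_info():
            return rows
        dwarf = elf.get_dwarf_info()
        for cu in dwarf.iter_CUs():
            lineprog = dwarf.line_program_for_CU(cu)
            if lineprog is None:
                continue
            names = [fe.name.decode() for fe in lineprog["file_entry"]]
            base = 1 if lineprog.header.version < 5 else 0
            rows.extend(rows_from_entries(lineprog.get_entries(), names, base))
    return rows
```

Calling load_rows on the unstripped firmware ELF should then give one row per address range, and a PC from the test log belongs to the row whose range contains it. The file names here are bare names from the line-program header, without the directory part.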
