I am writing software that performs fairly complex static analysis and dynamic tracing of other programs. It relies heavily on static DWARF information to assist in the tracing, including line/column info from the .debug_line DWARF section. For this program to have the accuracy we need, the DWARF debugging info must contain fine-grained and accurate row and column numbers. With clang, I can force row and column info to be populated by using the -g -Xclang -dwarf-column-info options together.

However, in some cases clang does not produce fine-grained enough column information. One such case is for loops. Take the following example program, which I will refer to as source01.c:

  1      
  2 int main ()
  3 {    
  4     int number1 = 10, number2 = 20;
  5     for (int i=0; i < 10; ++i) {
  6         number1++;
  7         number2++;
  8     }
  9     return 0;
 10 } 

I can compile it like so:

clang -g -Xclang -dwarf-column-info source01.c

This produces the executable a.out. I then use dwarfdump to inspect how the row/column info has been populated:

dwarfdump a.out > dwarf_info

Looking at the .debug_line section, I see all of the row/col pairs that are contained in the debug info of this executable:

.debug_line: line number info for a single cu
Source lines (from CU-DIE at .debug_info offset 0x0000000b):

<pc>        [row,col] NS BB ET PE EB IS= DI= uri: "filepath"
NS new statement, BB new basic block, ET end of text sequence
PE prologue end, EB epilogue begin
IA=val ISA number, DI=val discriminator value
0x004004f0  [   3, 0] NS uri: "/xxx/loop_01/source01.c"
0x004004fb  [   4, 5] NS PE
0x00400509  [   5,10] NS
0x0040051d  [   6, 9] NS
0x00400528  [   7, 9] NS
0x00400533  [   5,27] NS
0x00400548  [   9, 5] NS
0x0040054a  [   9, 5] NS ET

As you can see, there is the pair (5,10), which corresponds to int i=0;, and the pair (5,27), which corresponds to ++i. However, I would expect (and need) there to also be the pair (5,19), which would correspond to i < 10, but it isn't there. I have inspected the executable's instructions with objdump and confirmed that there are indeed instructions corresponding to the comparison i < 10 (so it has not simply been "optimized away").
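To make this check repeatable, here is a minimal sketch in C that scans dwarfdump-style output for a given (row, col) pair. The function names parse_line_entry and has_row_col are made up for illustration, and the parser assumes the exact "<pc> [row,col]" layout shown in the dump above; other dwarfdump versions may format these lines differently.

```c
#include <stddef.h>
#include <stdio.h>

/* Parse one line of dwarfdump-style line-table output, such as
 *   0x00400509  [   5,10] NS
 * On success, store the address, row, and column and return 1.
 * Return 0 for header lines and anything else that does not match.
 * (Hypothetical helper; assumes the layout shown in the dump above.) */
int parse_line_entry(const char *line, unsigned long *pc, int *row, int *col)
{
    return sscanf(line, " 0x%lx [ %d , %d ]", pc, row, col) == 3;
}

/* Return 1 if any of the n lines is an entry for (want_row, want_col),
 * and 0 otherwise. */
int has_row_col(const char *const *lines, size_t n, int want_row, int want_col)
{
    for (size_t i = 0; i < n; ++i) {
        unsigned long pc;
        int row, col;
        if (parse_line_entry(lines[i], &pc, &row, &col) &&
            row == want_row && col == want_col)
            return 1;
    }
    return 0;
}
```

Reading dwarf_info line by line (for example with fgets) and passing the lines to has_row_col should find (5,10) and (5,27) in the dump above, but not (5,19).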

Do you have any intuition as to why clang would not populate this info? Or is there a way to force clang to produce more fine-grained column info? It seems like clang should have this capability, because the ASTs that clang generates carry extremely fine-grained mappings between their nodes and source code rows and columns.

Thank you.

bddicken
  • As this is going largely unanswered: I recently had a similarly niche question about Clang and decided to try looking in the [source code](http://clang.llvm.org/get_started.html). I have no expertise in compilers and I don't even know C++, but I found it surprisingly easy to find what I needed (please don't take this as "I'm cleverer than you"; I was really intimidated by it to start with). Maybe your problem is harder than mine was, but give it a go! Alternatively, ask on the [mailing list](http://clang.llvm.org/get_involved.html). And don't forget to answer your own question afterwards. – Brendan Apr 29 '14 at 21:19
  • @Brendan Thanks for the comment! Yes, we (my team and I) have considered digging through the source and making some modifications to get the info we want. We may well end up going that route in the future. I also posted to cfe-users (no response), but I have not yet tried posting to cfe-dev. – bddicken Apr 30 '14 at 01:01

1 Answer

This isn't really a solution so much as an excuse but...

I believe the first entry for line 5, (5,10), includes code for both the initializer and the condition of the for loop. When I compile a program with a for loop, these two end up in a contiguous range of addresses.
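As a rough sanity check on this, the size of each entry can be estimated from the dump in the question by subtracting consecutive addresses. The sketch below does that in C; row_span and question_pcs are names I made up, and it assumes each entry covers everything up to the next entry's address.

```c
#include <stddef.h>

/* Addresses copied from the .debug_line dump in the question, in order. */
static const unsigned long question_pcs[] = {
    0x004004f0UL, /* [3, 0]  */
    0x004004fbUL, /* [4, 5]  */
    0x00400509UL, /* [5,10]  */
    0x0040051dUL, /* [6, 9]  */
    0x00400528UL, /* [7, 9]  */
    0x00400533UL, /* [5,27]  */
    0x00400548UL, /* [9, 5]  */
    0x0040054aUL  /* [9, 5], end of text sequence */
};

/* Number of bytes covered by entry i, assuming it extends up to the
 * address of entry i + 1. The last entry is treated as covering nothing.
 * (Hypothetical helper for illustration only.) */
unsigned long row_span(const unsigned long *pcs, size_t n, size_t i)
{
    if (i + 1 >= n)
        return 0;
    return pcs[i + 1] - pcs[i];
}
```

For the (5,10) entry (index 2) this gives 0x0040051d - 0x00400509 = 20 bytes, which seems like more than a single initializing store would need. The byte count alone does not prove which instructions are in that range, so it is still worth confirming with objdump.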

It would be nice to force clang to generate a separate entry for each statement, but I can't seem to find anything that would do that.

ccurtsinger