
If the loops are of different types, I can easily identify them by name. But if there are multiple loops of the same type (say, 5 while loops), how can I identify which basic block in the LLVM IR corresponds to which loop in the source code?

Manually this is easy, since we can walk through the source code and the LLVM IR sequentially, but I am looking for a way to do the same programmatically.

For example, I have the following source code in C:

int main()
{
   int count=1;
   while (count <= 4)
   {
        count++;
   }
   while (count > 4)
   {
        count--;
   }
   return 0;
}

When I execute the command clang -S -emit-llvm fileName.c, the file fileName.ll is created with the following content:

; ModuleID = 'abc.c'
source_filename = "abc.c"
target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-windows-msvc19.0.23026"

; Function Attrs: noinline nounwind uwtable
define i32 @main() #0 {
entry:
  %retval = alloca i32, align 4
  %count = alloca i32, align 4
  store i32 0, i32* %retval, align 4
  store i32 1, i32* %count, align 4
  br label %while.cond

while.cond:                                       ; preds = %while.body, %entry
  %0 = load i32, i32* %count, align 4
  %cmp = icmp sle i32 %0, 4
  br i1 %cmp, label %while.body, label %while.end

while.body:                                       ; preds = %while.cond
  %1 = load i32, i32* %count, align 4
  %inc = add nsw i32 %1, 1
  store i32 %inc, i32* %count, align 4
  br label %while.cond

while.end:                                        ; preds = %while.cond
  br label %while.cond1

while.cond1:                                      ; preds = %while.body3, %while.end
  %2 = load i32, i32* %count, align 4
  %cmp2 = icmp sgt i32 %2, 4
  br i1 %cmp2, label %while.body3, label %while.end4

while.body3:                                      ; preds = %while.cond1
  %3 = load i32, i32* %count, align 4
  %dec = add nsw i32 %3, -1
  store i32 %dec, i32* %count, align 4
  br label %while.cond1

while.end4:                                       ; preds = %while.cond1
  ret i32 0
}

attributes #0 = { noinline nounwind uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.module.flags = !{!0}
!llvm.ident = !{!1}

!0 = !{i32 1, !"PIC Level", i32 2}
!1 = !{!"clang version 4.0.0 (tags/RELEASE_400/final)"}

The IR now contains two loop condition blocks for the given source file, while.cond and while.cond1. How can I identify which basic block belongs to which while loop in the source code?

trent
Sanjit Kumar Mishra

1 Answer


Before I attempt to answer, I just want to note that, depending on the selected optimization level or on the passes you run manually with opt, that information might be missing or less accurate (e.g. because of inlining, cloning, etc.).

Now, the way to associate between low-level representations and source code is using debugging information (e.g. with the DWARF format). To produce debugging information you need to use the -g command-line flag during compilation.

For LLVM IR, the Loop API has relevant calls such as getStartLoc. So you could do something like this (e.g. inside the runOnFunction method of an llvm::FunctionPass):

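#include "llvm/ADT/SmallVector.h"
#include "llvm/Analysis/LoopInfo.h"
#include "llvm/IR/DebugInfoMetadata.h"

// In a legacy FunctionPass, getAnalysis takes no argument; the
// getAnalysis<...>(CurFunc) form is for use from a ModulePass.
auto &LI = getAnalysis<llvm::LoopInfoWrapperPass>().getLoopInfo();

// LI.begin()/LI.end() cover top-level loops only (see getSubLoops for nested ones).
llvm::SmallVector<llvm::Loop *, 8> workList(LI.begin(), LI.end());

for (auto *e : workList) {
  llvm::DebugLoc loc = e->getStartLoc();
  if (!loc)
    continue; // no debug info attached; was the code compiled with -g?

  auto line = loc.getLine();
  auto *scope = llvm::dyn_cast<llvm::DIScope>(loc.getScope());
  if (!scope)
    continue;
  auto filename = scope->getFilename();
  // do stuff here
}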

Moreover, for a BasicBlock, you can also use the debug-related methods of Instruction (e.g. getDebugLoc) and combine them with calls to other Loop methods such as getHeader.

Also, note that there is a getLoopID method that exposes an internal unique ID for each loop, but that ID is not always present and it is subject to the potential elisions I mentioned at the start. Anyhow, if you need to manipulate it, look at examples in the LLVM source that use the setLoopID method (e.g. in lib/Transforms/Scalar/LoopRotation.cpp).

compor
  • Can you please help me with the header file that I have to include for the `getAnalysis()` method? – Sanjit Kumar Mishra Dec 28 '17 at 12:05
  • The `getAnalysis` method is inherited by all the pass classes, so it should be readily available. The `LoopInfoWrapperPass` is available from `llvm/Analysis/LoopInfo.h`. I think using `grep` or similar tools on the LLVM source tree helps a lot. – compor Dec 28 '17 at 17:11
  • You might also want to have a look at [this](https://github.com/compor/AnnotateLoops) repo. It annotates loops with a numeric ID. You can adapt accordingly. – compor Jan 02 '18 at 00:18
  • Thank you for your reply, I will update here as I implement it. – Sanjit Kumar Mishra Jan 02 '18 at 06:36
  • I am using LLVM 4.0.1, and when I try to add the above code I get the error "identifier `getAnalysis` is undefined". I tried to resolve it but was not able to. – Sanjit Kumar Mishra Jan 03 '18 at 04:41
  • If you're using it in a pass (using the `LegacyPassManager`) it should be readily available, as this method is inherited from the pass class (described [here](http://llvm.org/docs/WritingAnLLVMPass.htm)). I don't have a working LLVM 4.0.1 setup, but I have compiled the earlier provided repo with current trunk (6.0.0). That part of the infrastructure hasn't changed. – compor Jan 03 '18 at 06:37
  • Actually, I am not using it inside a pass, I am reading the basic blocks from the bit code file and processing the same. – Sanjit Kumar Mishra Jan 03 '18 at 07:02
  • That'd help if it was posted in the question; that is why SO policy requires posting some of your code. Anyhow, in the case of a standalone tool you'll have to generate the `LoopInfo` "manually". For that, you can have a look at the test suite in the `unittests` subdirectory, e.g., the `Analysis/LoopInfoTest.cpp` where inside the `runWithLoopInfo` function, a standalone `LoopInfo` object for the passed `llvm::Function` is created using its constructed `DominatorTree`. I'm not sure if this test is available in 4.0.1, but it is in the current trunk (6.0.1). – compor Jan 03 '18 at 07:18