
I am trying to make the LLVM IR emitted by my front end compatible with the LLVM IR an FPGA backend expects (that is the backend goal).

The problem is that FPGAs cannot handle pointers to pointers because of how memory is allocated on the device. My front end, however, emits pointers to pointers in its LLVM IR, so they need to be replaced. How could I do this?

Below is a specific example. I would need to replace all i8** values with single pointers (or something else the backend accepts). In most applications %buffer_table would be a multidimensional array. The IR is obtained from TF XLA by setting the dump_ir environment flag. Please let me know if you need further information.

; Function Attrs: nounwind
define void @_Z3topv(i8* %retval, i8* noalias %run_options, i8** noalias %buffer_table, i64* noalias %prof_counters) #0 {
entry:
  %0 = getelementptr inbounds i8*, i8** %buffer_table, i64 1
  %1 = load i8*, i8** %0, align 8, !invariant.load !0, !dereferenceable !1, !align !1
  %arg0.1 = bitcast i8* %1 to i32*
  %2 = getelementptr inbounds i8*, i8** %buffer_table, i64 2
  %3 = load i8*, i8** %2, align 8, !invariant.load !0, !dereferenceable !1, !align !1
  %arg1.2 = bitcast i8* %3 to i32*
  %4 = getelementptr inbounds i8*, i8** %buffer_table, i64 0
  %5 = load i8*, i8** %4, align 8, !invariant.load !0, !dereferenceable !1, !align !1
  %multiply.5 = bitcast i8* %5 to i32*
  %6 = load i32, i32* %arg0.1, align 4, !invariant.load !0, !noalias !2
  %7 = load i32, i32* %arg1.2, align 4, !invariant.load !0, !noalias !2
  %8 = mul i32 %6, %7
  store i32 %8, i32* %multiply.5, align 4, !alias.scope !2
  ret void
}

attributes #0 = { nounwind uwtable "denormal-fp-math"="preserve-sign" "no-frame-pointer-elim"="true" "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "fpga.demangled.name"="top" "fpga.top.func"="top" "less-precise-fpmad"="false" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }

!0 = !{}
!1 = !{i64 4}
!2 = !{!3}
!3 = !{!"buffer: {index:0, offset:0, size:4}", !4}
!4 = !{!"XLA global AA domain"}

I am quite new to LLVM and I need this for my master's thesis. Any help would be highly appreciated!

RicDen
  • Where did you get this buffer_table argument from? What's your source? – SK-logic Aug 04 '21 at 16:05
  • I got it out of TF XLA by setting the dump IR environment flag. I also adapted the question. – RicDen Aug 06 '21 at 08:46
  • I guess you're out of luck then; it's TF XLA's memory-model choice, which you cannot really change. As you can see, buffer_table is an array of pointers - see the first 3 geps + loads. For an FPGA accelerator you'd prefer to pre-load everything into a number of linear block RAM buffers as a dedicated step and then run your kernel on them. If you're super determined you can in theory infer the memory access patterns from such an IR, but it makes much more sense to do it a couple of levels up, before you emit this IR. – SK-logic Aug 06 '21 at 08:54
  • Thank you for the input! I actually managed to handle my problem for this example and for a single-layer MNIST network. It is possible since I know what my input and output are. The buffer_table represents only input and output, which means I can replace them with an input parameter (e.g. an array with image data) and a return parameter for the output. Now I am working on bigger networks. Generally, you are absolutely right, and if I continue this after my master's thesis, I will look into the LLVM IR generation in XLA. – RicDen Aug 12 '21 at 12:43
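
A minimal sketch of the signature described in the last comment, assuming a two-input/one-output kernel (all names here are made up): the XLA buffer_table disappears entirely and the data is passed through plain typed pointers, which an FPGA flow can map to ports or block RAM.

; illustrative only: inputs and output as typed pointer arguments
define void @top(i32* noalias %in0, i32* noalias %in1, i32* noalias %out) {
entry:
  %a = load i32, i32* %in0, align 4
  %b = load i32, i32* %in1, align 4
  %prod = mul i32 %a, %b
  store i32 %prod, i32* %out, align 4
  ret void
}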
