Translation to LLVM IR directly or via C/Clang

Question

Suppose someone wants to statically compile a given language using LLVM. What are the biggest differences (advantages and disadvantages) between translating it to C first and then using Clang, versus translating it directly to LLVM IR?

The obvious answer, I guess, is that a front-end that knows the source language can produce an optimized IR representation more easily than Clang can be expected to produce one from the generated C.

Am I missing something here?

You might also want to look at http://stackoverflow.com/questions/10264635/compiler-output-language-llvm-ir-vs-c/10267898#10267898. — Richard Pennington, Mar 03 '13 at 22:17

score 1 · Answer 1 · answered Mar 01 '13 at 09:25

Advantages of using a generic C backend:

You can use any C compiler (not just Clang)
It is easier to debug the intermediate code when it is in such a high-level language
Depending on your source language semantics, it might be easier to translate it via C (but not necessarily)

And disadvantages are:

If your language is compiled incrementally (e.g., no clearly separated modules, a complex macro system, or whatever else), compiling via LLVM IR in a single module with immediate JIT compilation makes more sense than generating hundreds of tiny C modules. In other words, C enforces separate compilation.
If your source language semantics are too far from C, compiling straight to a lower level can be easier.
Not all LLVM functionality is directly accessible from C, e.g., intrinsics, alternative calling conventions, and debug metadata for a higher-level language.
Clang is big; excluding it will reduce your memory footprint.
Clang is not easy to maintain: it depends on the presence and exact locations of the headers, depends on some parts of gcc, etc. Without it, bare LLVM can be used on its own and dependencies can be kept self-contained.
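
To illustrate the calling-convention point: LLVM IR lets a front-end mark a call as `tail` and choose a convention such as `fastcc`, whereas standard C gives no guarantee that a tail call reuses the caller's stack frame. A front-end that goes through C often has to work around this, for example with a trampoline. The sketch below is only an assumption about what such generated C might look like; the names `is_even`, `is_even_step`, `is_odd_step`, `struct state` and `struct step` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical generated code for two mutually tail-recursive
   source functions, even(n) and odd(n). */

struct state {
    unsigned long n;   /* remaining argument */
    bool result;       /* filled in by the final step */
};

/* A step returns the next step to run; fn == NULL means "done".
   The struct wrapper is needed because a C function type cannot
   directly return a pointer to its own type. */
struct step {
    struct step (*fn)(struct state *);
};

static struct step is_odd_step(struct state *s);

static struct step is_even_step(struct state *s) {
    assert(s != NULL);
    if (s->n == 0) {
        s->result = true;
        return (struct step){ NULL };
    }
    s->n -= 1;
    return (struct step){ is_odd_step };   /* "tail call" odd(n - 1) */
}

static struct step is_odd_step(struct state *s) {
    assert(s != NULL);
    if (s->n == 0) {
        s->result = false;
        return (struct step){ NULL };
    }
    s->n -= 1;
    return (struct step){ is_even_step };  /* "tail call" even(n - 1) */
}

/* The trampoline: a plain loop, so stack depth stays constant
   no matter how many source-level tail calls are made. */
bool is_even(unsigned long n) {
    struct state s = { n, false };
    struct step next = { is_even_step };
    while (next.fn != NULL) {
        next = next.fn(&s);
    }
    return s.result;
}
```

Calling `is_even(1000000)` runs a million steps without growing the C stack. The cost is an indirect call per step and an explicit state struct, which a direct IR translation could avoid by emitting ordinary tail calls.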

Optimisations are not an issue in most cases. Clang deliberately generates extremely unoptimised LLVM IR. LLVM should take care of all the optimisations, not the frontends. Unless, of course, you can do some high-level optimisations, but then they won't depend on your backend choice.
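
A small way to see this for yourself, as a sketch only: take a naive function like the hypothetical `sum_below` below and compare the output of `clang -O0 -S -emit-llvm` with that of `clang -O2 -S -emit-llvm`. At `-O0` each local typically lives in an `alloca` stack slot with explicit loads and stores; the clean-up is left to LLVM's own passes.

```c
#include <assert.h>

/* Naive sum of 0 .. n-1, written the way a simple code generator
   might emit it, with no attempt at cleverness. */
unsigned sum_below(unsigned n) {
    /* Sanity guard, assuming unsigned is at least 32 bits wide:
       keeps the total within range. */
    assert(n <= 65536u);

    unsigned total = 0;
    for (unsigned i = 0; i < n; ++i) {
        total += i;
    }
    return total;
}
```

The same reasoning applies to IR that your own front-end emits directly: it can be just as naive, since the optimiser sees the same kind of input either way.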

SK-logic


Clang depends on parts of gcc? That's new to me. Care to elaborate which parts you're talking about? – Mar 01 '13 at 15:42
@delnan, `Clang` wants `crtbegin.o`, `crtend.o` and alike from `gcc`, as well as some of the headers. – SK-logic Mar 01 '13 at 16:43
LLVM takes up space in your compiler if you go directly to generating IR, or it (or its equivalent) takes up space in your C compiler. Your call. – vonbrand Mar 01 '13 at 16:47
@delnan, look at the dependencies of the `clang` package in your Linux system; it cites `gcc` and some of its internal libraries here (Fedora 18). But I'd also look at MacOS: Apple is allergic to GPL, so presumably there is a GCC-less way to build it. – vonbrand Mar 01 '13 at 16:51

@SK-logic, Yeah, I'm aware that Clang dependencies are a problem, but I was thinking about more abstract issues like some you gave in your answer. Do you have a simple example in mind of a language feature that would be hard to translate to C but somewhat easier to translate to LLVM IR? – Rhangaun Mar 01 '13 at 19:48
another big advantage: you get C interop for free and coding that ain't easy: https://github.com/llvm-mirror/clang/blob/master/lib/CodeGen/TargetInfo.cpp – Jun 10 '15 at 23:00
