
I want to start working on a small compiler, and I am hesitating between several languages to build it in.

My requirements are simple: I want to be able to emit LLVM IR, because I have an LLVM backend that I would like to reuse to target a specific platform.

So right now I have the following choices:

  1. Use OCaml and the LLVM bindings - Efficient, LLVM ships with the OCaml bindings, but the coding experience with OCaml (IDE, support) is not the best.

  2. Use C/C++ and the LLVM bindings - The most obvious way I would say, but I would like to use a functional language as this topic is new to me and I want to learn something new.

  3. Use F# - I fell in love with this language, but there are no official LLVM bindings. I guess I could do the same through System.Reflection.Emit, although there seems to be an initiative for F# bindings for LLVM: https://github.com/keithshep/llvm-fs

I would love to get your thoughts on this.

Chris Smith
Thibault Imbert
  • "this question will likely solicit opinion, debate, arguments, polling, or extended discussion." – Pascal Cuoq Jun 17 '11 at 19:19
  • C++/CLI needs a free compiler, and it supports functional programming, maybe not as well as the languages you mentioned... But you would give a great help to people porting Windows stuff! – dario_ramos Jun 17 '11 at 19:20
  • Maybe to Programmers SE? – Ramon Snir Jun 17 '11 at 19:29
  • Don't close this just because there's not an easy or correct answer. Informative discussion is helpful. – Daniel Jun 17 '11 at 19:58
  • @Daniel : Discussion is for Programmers SE though; not sure why it was voted 'not constructive' rather than 'off-topic' :-/ – ildjarn Jun 17 '11 at 20:28
  • @ildjarn: Not to be overly cynical, but sometimes it seems more important to keep the SO database tidy than to help people. This is a totally valid, programming-related question. A reasonable answer would have been a list of pros/cons of each. – Daniel Jun 17 '11 at 20:31
  • @Daniel : The problem is that the pros and cons would be subjective, so people would be gaining/losing rep over merely having opinions other people liked/disliked. That doesn't mesh with SO in my opinion. Why can't that discussion be had on Programmers SE? – ildjarn Jun 17 '11 at 20:32
  • @ildjarn: Granted, some opinions would likely be offered, but objective criteria could be given too. The voting system is useful for sifting those out. – Daniel Jun 17 '11 at 20:37
  • This is a great question, I would have liked to see some answers. – Dan Jun 17 '11 at 21:12
  • @ildjarn: "The problem is that the pros and cons would be subjective". That is obviously not true. There are many objective pros and cons. For example, the ML family of languages (that includes both OCaml and F#) was specifically designed for this and, consequently, they have many features that are ideally suited to this such as variant types and pattern matching. Also, targeting CIL instead of LLVM obviously gives you a garbage collector for free. Many more objective answers remain to be given so I am sorry to see this question closed. – J D Jun 18 '11 at 14:25
  • As only a fraction of people read anything at Programmers SE, the question would get more interesting/useful answers here, along with less useful ones, which the voting system would handle. – Robert Jeppesen Jun 18 '11 at 19:12
  • Are you going to work from Windows, Linux or something else? Will you work alone on this project? Do you think you might need other contributors later? Will you need external libraries in addition to LLVM? – Laurent Jun 22 '11 at 17:58

1 Answer


Metaprogramming is a real weak point of C++. Most of your effort will be expended trying to manipulate trees. The core advantage of OCaml and F# in this context is pattern matching over union types (and not functional programming) precisely because this makes it so much easier to manipulate trees. Historically, OCaml and F# come from the ML family of languages and were bred specifically for this application domain.
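
To make the point about pattern matching over union types concrete, here is a minimal OCaml sketch of a tree rewrite. The `expr` type and the `simplify` function are hypothetical names invented for this illustration; they are not taken from HLVM or from any real compiler. The sketch folds constants and applies a few algebraic identities, assuming expressions are pure.

```ocaml
(* Hypothetical toy AST, invented for illustration only. *)
type expr =
  | Int of int
  | Var of string
  | Add of expr * expr
  | Mul of expr * expr

(* Simplify bottom-up: fold constants and apply a few identities.
   Assumes expressions have no side effects, so x * 0 can become 0. *)
let rec simplify (e : expr) : expr =
  match e with
  | Int _ | Var _ -> e
  | Add (a, b) ->
    (match simplify a, simplify b with
     | Int x, Int y -> Int (x + y)          (* constant folding *)
     | Int 0, e' | e', Int 0 -> e'          (* 0 + e = e + 0 = e *)
     | a', b' -> Add (a', b'))
  | Mul (a, b) ->
    (match simplify a, simplify b with
     | Int x, Int y -> Int (x * y)          (* constant folding *)
     | Int 0, _ | _, Int 0 -> Int 0         (* 0 * e = e * 0 = 0 *)
     | Int 1, e' | e', Int 1 -> e'          (* 1 * e = e * 1 = e *)
     | a', b' -> Mul (a', b'))
```

Each rewrite rule is a single match case, and the compiler warns when a match does not cover every constructor. Doing the same in C++ would typically need a class hierarchy plus visitors or manual tag checks.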

I used LLVM via its OCaml bindings to write HLVM, which includes both standalone and JIT compilation to native code, multicore-capable garbage collection, foreign function interface, tail call optimization and many other features. The experience was very pleasant. My only advice would be to keep track of which LLVM features are tried-and-tested and which are experimental because you don't want to depend on anything experimental (e.g. the GC support when I wrote HLVM).

You can easily use System.Reflection.Emit to generate CIL from F# but you obviously won't be able to leverage your LLVM backend by doing so, although you do get a garbage collector for free, of course. .NET bindings to LLVM are an option. I am not familiar with the ones you cite but writing bindings to LLVM's C API is relatively straightforward. However, I am not sure how well supported LLVM is on the Windows platform.

Regarding OCaml vs F#, both have advantages and disadvantages but I'd say the overall difference is relatively small in this context. Writing functions to print values of big union types due to the lack of generic printing is tedious in OCaml, although this can be automated using some third-party macros. F# provides generic printing but is missing some useful features such as polymorphic variants and structurally-typed objects.
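
As a rough illustration of the printing boilerplate mentioned above, this is what a hand-written printer looks like for a small union type in OCaml. The `expr` type and `string_of_expr` are hypothetical names made up for this sketch; a real compiler AST would have many more constructors, each needing its own case.

```ocaml
(* Hypothetical toy AST, invented for illustration only. *)
type expr =
  | Int of int
  | Var of string
  | Add of expr * expr
  | Mul of expr * expr

(* One case per constructor, written by hand.
   %S prints a string in OCaml syntax, with quotes and escapes. *)
let rec string_of_expr (e : expr) : string =
  match e with
  | Int n -> Printf.sprintf "Int %d" n
  | Var v -> Printf.sprintf "Var %S" v
  | Add (a, b) ->
    Printf.sprintf "Add (%s, %s)" (string_of_expr a) (string_of_expr b)
  | Mul (a, b) ->
    Printf.sprintf "Mul (%s, %s)" (string_of_expr a) (string_of_expr b)
```

For example, `string_of_expr (Add (Int 1, Var "x"))` yields the string `Add (Int 1, Var "x")`. In F#, `printfn "%A"` gives comparable output without any of this code.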

J D
  • It's now 2015 and the problem with writing your own print functions in OCaml is largely gone. One of the solutions is using the `Sexplib` library, described in detail here: https://realworldocaml.org/v1/en/html/data-serialization-with-s-expressions.html – Vladimir Keleshev Jan 15 '15 at 10:17