4

I am working on a project where I have to define a new processor hardware architecture. I need a compiler to generate assembly code for this target (it has its own instruction set).

Programs for this processor will be written in C.

My idea to do this is to parse the C code and generate an Abstract Syntax Tree (AST), then from the AST generate the assembly.
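
As a rough illustration of the AST-to-assembly step, here is a minimal sketch in C. The node layout, the function name `emit_expr`, and the stack-machine mnemonics (`PUSH`, `ADD`, `MUL`) are all hypothetical; a real target would use its own instruction set, and a real parser would build the tree.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical AST for integer expressions. */
typedef enum { NODE_NUM, NODE_ADD, NODE_MUL } NodeKind;

typedef struct Node {
    NodeKind kind;
    int value;                 /* used only by NODE_NUM */
    const struct Node *lhs;    /* used by NODE_ADD / NODE_MUL */
    const struct Node *rhs;
} Node;

/* Post-order walk: emit code for both operands, then the operator.
 * Appends text to `out` (capacity `cap`) starting at offset `len`
 * and returns the new length. Mnemonics are made up for this sketch. */
static size_t emit_expr(const Node *n, char *out, size_t cap, size_t len)
{
    assert(n != NULL && len < cap);
    if (n->kind == NODE_NUM) {
        int w = snprintf(out + len, cap - len, "PUSH %d\n", n->value);
        assert(w >= 0 && (size_t)w < cap - len);
        return len + (size_t)w;
    }
    len = emit_expr(n->lhs, out, cap, len);
    len = emit_expr(n->rhs, out, cap, len);
    int w = snprintf(out + len, cap - len, "%s\n",
                     n->kind == NODE_ADD ? "ADD" : "MUL");
    assert(w >= 0 && (size_t)w < cap - len);
    return len + (size_t)w;
}
```

For the tree of `2 + 3 * 4`, calling `emit_expr` on the root with an empty buffer and `len` set to 0 produces `PUSH 2`, `PUSH 3`, `PUSH 4`, `MUL`, `ADD`, one per line.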

Certainly I'd like to reuse existing components (no need to rewrite a C parser I hope), but what tools or frameworks may I use to accomplish this task?

Thanks.

Vincenzo Pii

5 Answers

6

Take a look at LLVM.

It consists of separate modules which can be developed individually and which communicate through an intermediate language. In your case you'll have to write the assembly back-end and can reuse an existing C front-end instead of writing your own.
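
To make that division of labour concrete, here is a toy sketch in C of what a back-end does: it consumes a target-independent intermediate form and emits target assembly. This is not LLVM's API or its IR; the `IrInst` type, the `lower` function, and the three-operand mnemonics are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical three-address IR: dst = a OP b, on virtual registers. */
typedef enum { IR_ADD, IR_SUB, IR_MUL } IrOp;

typedef struct {
    IrOp op;
    int dst, a, b;   /* virtual register numbers */
} IrInst;

/* Instruction selection for a made-up target: one mnemonic per IR op. */
static const char *mnemonic(IrOp op)
{
    switch (op) {
    case IR_ADD: return "add";
    case IR_SUB: return "sub";
    case IR_MUL: return "mul";
    }
    return "?";
}

/* The "back-end": lower `n` IR instructions to assembly text in `out`
 * (capacity `cap`) and return the number of characters written. */
static size_t lower(const IrInst *code, size_t n, char *out, size_t cap)
{
    size_t len = 0;
    for (size_t i = 0; i < n; i++) {
        int w = snprintf(out + len, cap - len, "%s r%d, r%d, r%d\n",
                         mnemonic(code[i].op),
                         code[i].dst, code[i].a, code[i].b);
        assert(w >= 0 && (size_t)w < cap - len);
        len += (size_t)w;
    }
    return len;
}
```

Anything that can produce the IR (a C front-end, or a front-end for another language) can share this one lowering step, which is the point of the modular design.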

orlp
  • And, as a bonus, you also get other language front-ends. – MSalters Nov 10 '11 at 09:09
  • Trying to get any tips or tricks? That's cool. Anyway, here is a massive code base for you to search through and decipher. Good luck :] –  May 16 '22 at 19:00
2

I think the GNU GCC 4.5.x toolchain is excellent, as it now supports plugins as well. Create a foo.c and have a look at the raw tree dumps from gcc:

gcc -fdump-tree-original-raw ./foo.c
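
If you do not have a test file at hand, a tiny foo.c such as the following should be enough to get a readable dump. The function name `add` is only an illustration.

```c
/* foo.c: a minimal translation unit to feed to the dump command above. */
int add(int a, int b)
{
    return a + b;
}
```

Since this file has no `main`, add `-c` to the command so that gcc stops before the link step.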

Biased opinion

I prefer it over LLVM for porting because it is widely adopted and has already been ported to many architectures. LLVM puts in an extra level of abstraction that you may not need for your project. However, do study both; there are pros and cons.

More fun stuff

http://dragonegg.llvm.org/

Ahmed Masud
1

You should look at LLVM ( http://llvm.org ).

Writing a compiler is far from being trivial. I would not suggest doing it from scratch.

LLVM is modular and you will only need to create the assembly backend.

crazyjul
0

LLVM is one option. You can also consider writing a GCC backend, but it will be much harder given how complex GCC is.

Noufal Ibrahim
0

Clang + LLVM is one of the options. Alternatively, you can try retargeting lcc or Open64.

lcc is suitable for simple, non-standard architectures, with little hope of proper low-level optimisation. LLVM is the best choice for register machines (but will cause trouble if you need, say, segmented 16-bit memory). Open64 offers pretty much the same level.

Retargeting gcc is also an option, but it will require much more mundane manual labour than the others.

SK-logic