This is my first post and I am posting from a phone so please excuse the formatting issues I am sure I will have.

As the title suggests, I want to create a language for a small virtual machine that I have written. Currently my virtual machine is very simplistic and supports about 16 different opcodes. I have slowly been learning how to optimize the VM and add more functionality to it. The biggest feature that I would like to add is the ability to write programs in a simple C-derived language.

I am stuck as to how I would even start writing such a "language". Would I write it as an interpreter that translates into my VM's assembly code?

Any help whatsoever would be awesome. Articles, books, lectures, I am up for anything. I just love learning and so far this has been my largest and, by far, favorite project that I have worked on.

Edit

I hope that I asked in the right area and would be more than willing to provide additional information in the morning if needed.

Seki
NotAPro
  • You are basically looking at porting some toolchain to create binaries for your machine. For a compiler part, you just need to create custom backend. As GCC and LLVM are two popular options, you should look there. – dbrank0 May 29 '14 at 08:45
  • I suppose that is the correct word, toolchain. However, I was never planning on making it compile to native code. It is more that I want to write a small, C-based language that compiles into my VM's executable format. – NotAPro May 29 '14 at 10:43
  • Start by reading a book about compilers. – Marco van de Voort May 29 '14 at 12:13
  • Are there any suggestions? I have ordered a copy of "The Dragon Book", but I hesitated buying any...more modern...books. – NotAPro May 29 '14 at 12:59
  • You *are* making a compiler. The target "native" code happens to be your artificial instruction set. – Seva Alekseyev May 30 '14 at 01:01
  • A small C-based language compiler is a fairly large project, even if you are just putting a back end on GCC or LLVM or some other compiler; it doesn't matter what the instruction set is. (It is a bigger project than the VM itself.) – old_timer Jul 13 '17 at 21:50

3 Answers

Assuming that your instruction set is sufficiently powerful, you should be able to construct a compiler (code translator) for it. An interpreter is also possible, but if you can compile from C, you unlock a universe of community code.

Per your comment, you are right to order the bible of compiler writing: Compilers: Principles, Techniques, and Tools. I'd hold off on other books until you have that one mastered; then you'll know where you want to go.

You'll learn about grammars, parsers and all that, but then you'll need to make some practical decisions. One option to avoid most of the work of building a compiler is to merely build a back-end for an existing compiler such as GCC or Clang (via LLVM). By building a back-end you can also potentially compile in one of several language front-ends, such as D; so with one effort you get a compiler for C++, D, and more.
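To make the "grammars, parsers and all that" part a bit more concrete, here is a minimal sketch of the compile-to-your-VM idea: a recursive-descent parser for integer arithmetic that emits instructions for a stack VM. The opcode names (`PUSH`, `ADD`, `SUB`, `MUL`, `DIV`) and the tuple encoding are assumptions for illustration only, since I don't know your actual instruction set.

```python
import re

# One token is a run of digits or a single operator/parenthesis.
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(src):
    """Split the source text into a list of token strings."""
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        match = TOKEN.match(src, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def compile_expr(src):
    """Compile an arithmetic expression to hypothetical stack-VM code."""
    tokens = tokenize(src)
    code = []  # emitted instructions, e.g. ("PUSH", 2) or ("ADD",)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        # factor := NUMBER | "(" expr ")"
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        advance()
        if tok == "(":
            expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
        elif tok.isdigit():
            code.append(("PUSH", int(tok)))
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        # term := factor (("*" | "/") factor)*
        factor()
        while peek() in ("*", "/"):
            op = advance()
            factor()
            code.append(("MUL",) if op == "*" else ("DIV",))

    def expr():
        # expr := term (("+" | "-") term)*
        term()
        while peek() in ("+", "-"):
            op = advance()
            term()
            code.append(("ADD",) if op == "+" else ("SUB",))

    expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return code


for instruction in compile_expr("2 + 3 * 4"):
    print(instruction)
```

A real C-like language adds variables, control flow, and functions on top of this, but the shape stays the same: each grammar rule becomes a function, and each function emits code for your VM as it recognizes its piece of the input.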

N8allan

After purchasing/reading a compiler book and understanding fundamentally how a compiler works, I would suggest using LLVM for your task. Specifically, focus on creating a target and don't worry about the front-end (lexical analysis, parsing, AST construction) at all. This way you can support any language that generates LLVM IR.

Start by looking over the LLVM target-independent code generator framework. This is where you will be performing your magic:

http://llvm.org/docs/CodeGenerator.html

The LLVM target-independent code generator is a framework that provides a suite of reusable components for translating the LLVM internal representation to the machine code for a specified target—either in assembly form (suitable for a static compiler) or in binary machine code format (usable for a JIT compiler).

In addition, LLVM is nice enough to guide you through the development of a back-end, the part you would be interested in:

http://llvm.org/docs/WritingAnLLVMBackend.html

It outlines the prerequisite reading required to carry out such a task, and it covers code examples, optimization techniques, instruction selection, and instruction printing (for your VM). Lastly, it explains how to support JITing, if desired.

If you download the source code, there are many complete back-ends that you can copy and modify according to your VM's requirements.

wbennett
  • Thank you very much for your input. Thank you for the links as well. I haven't had as much time as I would like to really sit down and learn about compilers and read those articles in full, but from the first few paragraphs, they seem to be excellent resources. – NotAPro Jun 11 '14 at 17:37
I should ask, is this a Stack or Register based virtual machine?

Each one has advantages and disadvantages. If you want your language to run as fast as possible, I advise a register machine; if you want it simple, then a stack machine.
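To show what the difference means for the code a compiler has to generate, here is a rough sketch that runs `(2 + 3) * 4` on a toy stack machine and on a toy register machine. Both instruction sets (`PUSH`, `LOADI`, `ADD`, `MUL`) and the tuple encoding are made up for this example; they are not taken from your VM.

```python
def run_stack(program):
    """Toy stack machine: operands are implicit, taken from the stack."""
    stack = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(args[0])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return stack.pop()


def run_register(program, num_regs=4):
    """Toy register machine: every instruction names its operands."""
    regs = [0] * num_regs
    for op, *args in program:
        if op == "LOADI":  # load an immediate value into a register
            dst, value = args
            regs[dst] = value
        elif op == "ADD":
            dst, a, b = args
            regs[dst] = regs[a] + regs[b]
        elif op == "MUL":
            dst, a, b = args
            regs[dst] = regs[a] * regs[b]
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return regs[0]  # by convention here, the result is left in r0


# (2 + 3) * 4 for the stack machine: no operand addresses anywhere.
stack_code = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)]

# The same expression for the register machine: the compiler had to
# decide that r0 and r1 hold the intermediate values.
register_code = [
    ("LOADI", 0, 2),
    ("LOADI", 1, 3),
    ("ADD", 0, 0, 1),
    ("LOADI", 1, 4),
    ("MUL", 0, 0, 1),
]

print(run_stack(stack_code))        # 20
print(run_register(register_code))  # 20
```

The stack version is easy to generate because the compiler never chooses where a value lives. The register version needs the compiler to assign registers, which is extra work in the code generator but gives the VM fewer pushes and pops to perform at run time.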

As others suggested, you could just have your language compile to LLVM; it depends on your needs, of course.

Nergal