27

I'm planning to write a C# 3.0 compiler in C#. Where can I get the grammar for parser generation?

Preferably one that works with ANTLR v3 without modification.

Mehrdad Afshari
  • 6
    You are aware that we already ship for free a compiler that compiles C# 3, right? :-) But seriously, why are you building your own? Just for fun, or is there some business purpose? (The reason I ask is because we are very interested in learning what "services" people want out of our compiler other than simply "spit me out some IL for this source code".) – Eric Lippert Oct 13 '09 at 16:55
  • 1
    Eric: Primarily for fun. However, I come up with some language ideas from time to time that I wish I could test. – Mehrdad Afshari Oct 13 '09 at 16:57
  • @Mehrdad, if you do get some code going, can I play around with it? :) – Stan R. Oct 13 '09 at 17:07
  • @Stan R: There are plenty of open source C# compilers out there right now. Mono's C# compiler is written in C#, for instance. – Mehrdad Afshari Oct 13 '09 at 17:09
  • Given you don't need modification, you don't really want to write much, do you? – leppie Oct 13 '09 at 17:16
  • @leppie: I'm OK with modification, but I don't want to spend too much time writing basic stuff for the C# language. I'd prefer to get one working quickly and experiment with it afterwards. I already know how to write parsers, and my primary goal is not learning how to parse. I considered messing with the Mono C# compiler, but I prefer writing my own. – Mehrdad Afshari Oct 13 '09 at 17:21
  • 2
    @Mehrdad: "...language ideas" is pretty vague. If you want to do anything interesting with C# (or really any other language) you not only need a parser (therefore grammar) but you also need to build trees, build symbol tables, analyze symbol usage, ... There's a lot more to this than just the grammar. The Mono framework might be a lot more helpful than you think. – Ira Baxter Nov 02 '09 at 02:37
  • Check out answers to http://stackoverflow.com/questions/358052/c-anltr-grammar – Ira Baxter Nov 02 '09 at 02:40
  • 1
    Ira: Of course I have seen that question. I explicitly mentioned C# 3.0, since I've found a bunch of stuff for 1.0. -- I said "primarily for fun" by the way. For me, it's more fun to write my own and use that to test my stuff rather than try to understand the structure of code done by Mono guys. – Mehrdad Afshari Nov 02 '09 at 03:55
  • @Mehrdad: I've built tools to manipulate languages. They take a lot more work than you might expect. You're making what I see as a classic mistake of "if I just had a parser...". Best of luck. – Ira Baxter Nov 02 '09 at 05:52
  • Ira: Of course. It's not supposed to provide any business value. It's a personal project and it's going to be fun. I'll be doing it in my free time. – Mehrdad Afshari Nov 02 '09 at 08:53

4 Answers

12

Take a look at the C# Language Specification. You'll find the grammar in chapter B, "Grammar".

Michael Damatov
  • Yeah, of course the spec contains grammar. However, the grammar in that *Word document* is scattered through the whole doc and is unsuitable for parser generation. – Mehrdad Afshari Oct 14 '09 at 09:46
  • 8
    It's not *only* scattered throughout; we have an appendix at the end with the whole thing in one place. You are probably right that it would take some modification to make it work for a parser generator. – Eric Lippert Oct 14 '09 at 14:28
  • 1
    Eric: Oh, I didn't notice that section. Thanks for pointing it out. – Mehrdad Afshari Oct 14 '09 at 16:27
  • 1
    Michael: It's less than 40 pages long. When I think about it, it's possible to deal with it and start from scratch. +1 – Mehrdad Afshari Oct 14 '09 at 16:30
7

I ran into ANTLR C# Grammar on CodePlex. It's a relatively new project and uses ANTLR 3.2. It says it supports C# 4.0 and is licensed under the Eclipse Public License (EPL).

I played with it a little. It has a bunch of test files containing expressions. It supports lambdas, unsafe context, ... as you'd naturally expect. It parses a C# file and hands you an abstract syntax tree. You can do whatever you want with it.

Mehrdad Afshari
3

Are you looking for something like this or this?

Please also refer to the C# ANTLR grammar question.

Community
Rubens Farias
  • 1
    The linked grammar is not C# 3.0. It doesn't support lambdas. That's specifically important to me. – Mehrdad Afshari Oct 13 '09 at 17:11
  • 2
    It would seem that adding support for lambdas in terms of existing constructs in the grammar is fairly trivial, since you only need to define the argument list. This will probably need LL(*), however, since you can parse something like `(a**` and not know whether it will end up being an expression like `(a**b)` (i.e. multiply `a` by the result of dereferencing `b`) or a lambda expression `(a** b) =>` until you hit the `=>`. Since there's no limit on the levels of indirection (pointer to pointer to ...), it looks like LL(*) to me. But since ANTLR3 supports opt-in LL(*), it's not a problem. – Pavel Minaev Oct 13 '09 at 19:27
  • @Pavel: It's not just that. It doesn't support generics. I'll probably write my own parser or grammar from scratch if I can't find a reasonably good C# 3.0 grammar. – Mehrdad Afshari Oct 13 '09 at 19:51
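To make the lookahead point in the comments above concrete, here is a minimal Python sketch. It is not ANTLR's algorithm and not a C# parser. The token pattern and the names `tokenize` and `lookahead_needed` are made up for illustration. It only counts how many tokens have to be inspected before `( ...` can be classified as a parenthesized expression or as a lambda parameter list, assuming the decision is made by finding the matching `)` and peeking at the token after it.

```python
import re

# Hypothetical token pattern for this sketch only; a real C# lexer handles far more.
TOKEN_RE = re.compile(r"\s*(=>|[A-Za-z_]\w*|\d+|[()*,+\-/])")


def tokenize(text):
    """Split text into a flat token list; raise on characters the sketch does not know."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def lookahead_needed(tokens):
    """Classify input that starts with '(' and report how far we had to look.

    Returns (kind, n): kind is 'lambda' or 'expression', and n is the number of
    tokens inspected, including the token after the matching ')' if there is one.
    """
    if not tokens or tokens[0] != "(":
        raise ValueError("sketch expects input starting with '('")
    depth = 0
    for index, token in enumerate(tokens):
        if token == "(":
            depth += 1
        elif token == ")":
            depth -= 1
            if depth == 0:
                # Only the token after the matching ')' settles the question.
                follows = tokens[index + 1] if index + 1 < len(tokens) else None
                kind = "lambda" if follows == "=>" else "expression"
                return kind, min(index + 2, len(tokens))
    raise ValueError("unbalanced parentheses")


if __name__ == "__main__":
    for source in ["(a**b)", "(a** b) => a", "(a**** b) => a"]:
        print(source, lookahead_needed(tokenize(source)))
```

Each extra `*` pushes the deciding `=>` one token further away, so no fixed k is enough. That is the situation where a fixed-lookahead LL(k) decision fails and arbitrary lookahead (or backtracking) is needed.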
3

Take a look at COCO/R. It seems that they have the language specification for C# 3.0.

ErvinS