13

In C#, identifiers such as int or string are actually language-level keywords.
What is the reason for that?

Note that if the authors wanted to disallow user types with these names, they could have made that a semantic error rather than a syntax error.

Some clarifications based on answers:

  1. They are keywords because it makes parsing possible/easier
    I do not see why, as I am developing a parser, and having Type.Rule = Identifier is much simpler than Type.Rule = Identifier | "int" | "string" | ....

  2. They are keywords because they are special aliases
    var and dynamic are special as well, but they are not keywords (for compatibility reasons; nevertheless, it demonstrates that something does not have to be a keyword to be special). As a different example, applying [Serializable] to a type produces the magic IL metadata modifier serializable instead of a standard custom attribute. But it is still not a keyword.

  3. They are keywords because they were keywords in other languages
    Good answer, but then, why are they keywords in other languages? Also, it is certainly possible to highlight them in blue without them being keywords, so why bring that in from other languages?
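The alternative described in clarification 1 (a grammar where a type is just an identifier, with built-in names handled later) might look roughly like the sketch below. It is written in Python for brevity and is only an illustration: `BUILTIN_ALIASES`, `parse_declaration`, and `resolve` are hypothetical names, not part of any real C# compiler.

```python
# Hypothetical table of built-in aliases, consulted only after parsing.
BUILTIN_ALIASES = {"int": "System.Int32", "string": "System.String"}

def parse_declaration(tokens):
    """Parse `Type Identifier ;` where Type is simply an identifier."""
    if len(tokens) != 3 or tokens[2] != ";":
        raise SyntaxError("expected: Type Identifier ;")
    type_name, var_name = tokens[0], tokens[1]
    for name in (type_name, var_name):
        if not name.isidentifier():
            raise SyntaxError(f"not an identifier: {name!r}")
    return {"type": type_name, "name": var_name}

def resolve(declaration, user_types=()):
    """Semantic phase: resolve the type and reject built-in names as variables."""
    if declaration["name"] in BUILTIN_ALIASES:
        raise ValueError(
            f"semantic error: {declaration['name']!r} names a built-in type"
        )
    type_name = declaration["type"]
    if type_name in BUILTIN_ALIASES:
        return BUILTIN_ALIASES[type_name]
    if type_name in user_types:
        return type_name
    raise ValueError(f"semantic error: unknown type {type_name!r}")
```

In this sketch `int int;` parses without complaint and is only rejected by `resolve`, which is the "semantic error, not syntax error" behavior the question asks about.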

Andrey Shchekin
  • 3
    This is the case for many similar languages (C, C++, Java to name just a few). Why does it matter? – Oliver Charlesworth Jul 21 '12 at 10:18
  • 1
    @GSerg: That's not really related. It explains how to work around the language level keywords, but does not answer the current question asking for a rationale behind these keywords. – O. R. Mapper Jul 21 '12 at 10:29
  • 1
    As @OliCharlesworth says, it's very common for languages to be designed in this way... However, I think it's a very interesting question. +1 :) – jalf Jul 21 '12 at 10:31
  • I always get flak for this so just a comment. Fifty years from now, is C# still going to be relevant? Int32 will *always* be 32 bits. But will `int` still be 32-bits when everybody has a 256-bit cpu with a petabyte of ram in their phone? The C language is 34 years old, developed on an 18-bit machine. It survived by not nailing down the size of an int. – Hans Passant Jul 21 '12 at 12:15
  • 1
    @HansPassant Yes. I think C#'s `int` will still be 32 bits fifty years from now, `long` will still be 64 bits. Then 128-bit is `xlong` (eXtra long), 256-bit `vxlong` (Very eXtra Long) ツ – Michael Buen Jul 21 '12 at 12:44
  • 1
    Actually, `var` and `dynamic` are _contextual_ keywords, `string` and `int` are _reserved_ keywords. The main difference between _contextual_ and _reserved_ keywords is that the latter cannot be used as identifiers. [Reserved and Contextual Keywords](http://blogs.msdn.com/b/ericlippert/archive/2009/05/11/reserved-and-contextual-keywords.aspx) – Paolo Moretti Jul 21 '12 at 17:23
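The reserved versus contextual distinction in the last comment can be illustrated with a small sketch. This is a simplified Python model, not how the C# compiler is implemented: the two sets and the function `classify_local_declaration` are hypothetical, and the real compiler also considers whether a user type named `var` is in scope.

```python
RESERVED = {"int", "string"}      # never usable as identifiers
CONTEXTUAL = {"var", "dynamic"}   # special only in certain positions

def classify_local_declaration(type_word, name_word):
    """Classify the two words of a declaration such as `var x = 3;`."""
    if name_word in RESERVED:
        raise SyntaxError(f"{name_word!r} is reserved and cannot name a variable")
    if type_word in RESERVED:
        type_kind = "reserved keyword"
    elif type_word in CONTEXTUAL:
        type_kind = "contextual keyword"
    else:
        type_kind = "identifier"
    # In the name position, any non-reserved word is an ordinary identifier.
    return type_kind, "identifier"
```

Under this model `var var = 3;` is accepted because `var` is only special in the type position, while `int int = 3;` is rejected outright.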

3 Answers

4

As far as I've read, the C# designers wanted to allow the typical type names that are usual in C-style languages, namely string, int, and so on. At the same time, each of these types has to have a fully qualified type name such as System.String and System.Int32. Therefore, it was decided to provide aliases for these frequently used types.

If I find the source for this statement again, I'll add the link.

In other CLI-based languages, the same fully qualified type identifiers are valid. However, type names such as int or string might not be usual in such languages, so other aliases might be provided there.

One possible advantage of using the type alias may be improved readability, which is why there is a StyleCop rule that enforces use of the alias over the regular type name. The point about brevity is also mentioned in this thread on the same topic.

O. R. Mapper
  • So you're saying that the reason these aliases are *keywords* is simply because they wanted to stick close to other C-style languages? The rest of it seems irrelevant, since he isn't asking why these aliases *exist*, but merely why they're defined as language keywords – jalf Jul 21 '12 at 10:56
  • Yes, as jalf said, I see why having alias is useful, but why have them as keywords? – Andrey Shchekin Jul 21 '12 at 10:58
  • @AndreyShchekin: Why *not* have them as keywords? – Oliver Charlesworth Jul 21 '12 at 11:08
  • @jalf: Oh, I see, I didn't realize there could be a difference between these two questions. After all, anything that is not an identifier (which is always associated with a namespace and might get rather lengthy) and not a literal is automatically a keyword. – O. R. Mapper Jul 21 '12 at 11:13
  • 2
    @AndreyShchekin: What else could an alias be? If they are normal identifiers, they are bound to a namespace again, thereby losing some of the advantages of having the aliases in the first place. Hence, they are keywords. – O. R. Mapper Jul 21 '12 at 11:15
  • @O.R.Mapper Why couldn't you have aliases that are not keywords but also aren't bound to any namespace (or are in the global namespace)? – svick Jul 21 '12 at 11:17
  • @svick: Because such a kind of non-literal non-comment token is not part of the C# language. Either it's a normal identifier, then it belongs to a namespace. Or it does not belong to a namespace, then it's a keyword. – O. R. Mapper Jul 21 '12 at 11:18
  • @O.R.Mapper But that's circular reasoning. The question is “why was C# designed this way?” You can't answer that with “because C# was designed this way.” – svick Jul 21 '12 at 11:19
  • @svick: No, the question is "Why are the aliases in C# designed this way?", and the answer is, to some extent, "Because C# doesn't allow aliases to be designed any other way." – O. R. Mapper Jul 21 '12 at 11:20
  • @svick: As for identifiers in the global namespace, that might be a solution; however, note that there are no other non-keyword identifiers available only in a particular language and not in other CLI languages. That would probably involve some compiler magic, hence it was supposedly deemed more useful to define these language-specific strings just like other language-specific strings, namely as keywords. – O. R. Mapper Jul 21 '12 at 11:22
  • @O.R.Mapper: well `var`, for example, is not a keyword in the same sense as `int`. You can have `var var = 3`, but not `int int = 3`. And `var` is not even a type. – Andrey Shchekin Jul 21 '12 at 11:40
  • @AndreyShchekin: Ah, it seems like such *contextual keywords* (that are not generally reserved) were not introduced until C# 2.0, and therefore only after `int` and the like had already been defined as non-contextual keywords. So, at the time `int` etc. were defined, C# did indeed not offer any other options. – O. R. Mapper Jul 21 '12 at 11:44
  • @O.R.Mapper: ok, maybe it is not a right example, but actually, from the point of the parser, they do not have to be keywords at all. They can be special, have some magic applied, and still not be keywords. For example, applying `[Serializable]` at a type produces magic IL metadata bit `serializable` instead of working as all other attributes. But it is still not a keyword. – Andrey Shchekin Jul 21 '12 at 11:51
  • @AndreyShchekin: If they are special (whatever that means) in a way that they can be used everywhere (irrespective of namespace visibility) and cannot be redefined (at least in some contexts), they essentially behave like keywords, so what would be the point in not calling them keywords? – O. R. Mapper Jul 21 '12 at 12:20
  • @AndreyShchekin: Right, `SerializableAttribute` is not a keyword, but an identifier, and it behaves exactly like one: It is only available when its namespace is included, it is defined in an assembly and thus available in all languages that use that assembly etc. - whether or not the compiler performs any magic when encountering that identifier does not change its being an ordinary identifier, code-wise. – O. R. Mapper Jul 21 '12 at 12:20
  • 1
    @O.R.Mapper: the principal difference between "calling them keywords" and them _being_ keywords is the way grammar is defined. It is not just a question of terminology, it has very specific effect on parser behavior. – Andrey Shchekin Jul 21 '12 at 15:32
  • @AndreyShchekin: I was seeing this from a user perspective rather than from the perspective of compiler internals. Reading your updated question, I realize you are rather looking for the latter, sorry. – O. R. Mapper Jul 21 '12 at 18:16
3

Almost every language I have learned works like this: C, C++, Java, Pascal, C#. I am not entirely sure, but according to the compiler design class I took at university (a course where we learned how compilers are written, step by step, and implemented our own compiler), the main reason is that it makes the lexical analysis phase easier.

When you write code, all of it is just one simple string, and the compiler must go through many phases before it actually compiles it.

For example, when you type:

int a = 5;

The first phase, lexical analysis, must produce a token list like the one below and send it to the parser phase:

    int ---> Keyword(int)
    a   ---> Variable
    =   ---> Operator(=)
    5   ---> Integer
    ;   ---> ;

How does lexical analysis know this? First, it builds a dictionary table and searches it for the string you typed. As soon as the first entry matches, it STOPS and takes that token. (This is important!)

The dictionary looks like this:

if   ---> if
then ---> then
int  ---> int
.... // all keywords here
[a-z][a-z0-9_]* ---> variable  // example regular expression; I am not sure it is exact, just something like this :D

So, if the language allowed you to use int as a variable name, such as:

int int = 5;

the above method of lexical analysis breaks: when it reads the second int, it does not know whether it is a variable or a keyword, and it needs more complicated steps to decide.

I am not saying it cannot be done, but it is more complicated, slower to compile, and unnecessary. It is simpler to just tell the programmer: "Hey, DON'T DO THAT, or I will not compile your program :))"
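The dictionary lookup described above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the lexer of any real compiler: the keyword set, the token names, and the `tokenize` function are all hypothetical.

```python
import re

# Hypothetical keyword table; a real lexer has many more entries.
KEYWORDS = {"if", "then", "int"}

# One alternative per token kind; group names are illustrative only.
TOKEN_RE = re.compile(r"""
    (?P<WORD>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<INTEGER>[0-9]+)
  | (?P<OPERATOR>=)
  | (?P<SEMICOLON>;)
  | (?P<SKIP>\s+)
""", re.VERBOSE)

def tokenize(source):
    """Split source into (kind, text) pairs, checking keywords first."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at {pos}: {source[pos]!r}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue
        if kind == "WORD":
            # The keyword table wins: the lexer never asks about context.
            kind = "KEYWORD" if text in KEYWORDS else "IDENTIFIER"
        tokens.append((kind, text))
    return tokens
```

With this scheme, both words in `int int = 5;` come out as keywords, so the parser never sees an identifier in the name position and reports a syntax error.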

Hope this helps :)

hqt
  • 4
    But what makes `int` special, when compared with, say, `Int32`? For example `Int32 Int32 = 5;` compiles just fine. *Why* are the type aliases keywords that can't be used as identifiers? – svick Jul 21 '12 at 11:12
  • @svick In fact, `Int32` is just the name of a type, `System.Int32`. So if `Int32 Int32 = 5;` were an error, then this line would have to be an error too: `Cat cat = new Cat();` – hqt Jul 21 '12 at 11:22
  • @svick `Int32` is the name of a class in the `System` namespace. The only rules for identifiers are pretty much that they can only contain certain characters and start with a letter or underscore. So `ClassName ClassName = new ClassName();` should compile. – Louis Waweru Jul 21 '12 at 11:23
  • I don't know about ISO Pascal, but in Delphi's Object Pascal, the built-in types aren't keywords (except for string types, which use a special syntax), and there is no problem whatsoever with that. –  Jul 21 '12 at 11:26
  • 1
    I updated the question to cover this. If I am making a parser, I would define `SimpleTypeReference` as `Identifier` and it will be much simpler than defining it as `Identifier | BuiltInTypeName`, since I can postpone this differentiation to reference resolution phase, which is already complex anyway. – Andrey Shchekin Jul 21 '12 at 15:36
  • @AndreyShchekin Sorry, I don't quite understand your idea. But after `Lexical Analysis` comes the `Parser`. In the `Parser`, you help the compiler know the "language" rather than just the "vocabulary". For example, at this phase you must make the compiler understand that `a + b = c` is valid and `a + b = if` is invalid. – hqt Jul 21 '12 at 16:09
1

This is because int and other special value types that are built into the C# language are not really types at all but aliases of the .NET Framework System types.

The reason this is a syntax error and not a semantic error is simply that the error is detected in the syntactic analysis phase, which happens before the semantic one. The syntactic phase has all the information it needs to determine whether int is used as a type or as something else. Let's say we have the following rule:

declaration = type identifier ;

The syntactic phase checks that identifier matches something like [a-zA-Z]([a-zA-Z] | [0-9])* and is not a reserved keyword or alias, which in the case you described it is. So it makes total sense to call this a syntax error.
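The check described above might be sketched like this in Python. It assumes a simplified identifier pattern (a letter followed by letters or digits) and a hypothetical `RESERVED` set; neither is taken from the C# specification.

```python
import re

# Hypothetical reserved words; the real list is much longer.
RESERVED = {"int", "string", "if"}

# Simplified identifier pattern: a letter, then letters or digits.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")

def check_identifier(text):
    """Return 'ok' or a syntax error message for the name in `type identifier ;`."""
    if IDENTIFIER.fullmatch(text) is None:
        return "syntax error: not an identifier"
    if text in RESERVED:
        return "syntax error: reserved word"
    return "ok"
```

Both rejections happen before any type information is consulted, which is why the error is reported as a syntax error in this design.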

Roy T.
  • 2
    I don't think that really answers the question. In many other languages, built-in types are *also* keywords, but they're not aliases of .NET system types. And for the second part, you're really just restating the question: "it's a syntax error because it's detected in the syntax error detection phase". Doesn't explain *why* it was designed in this way. :) – jalf Jul 21 '12 at 10:29
  • 1
    but circular reasoning works! http://www.resistanceisfruitful.com/blog/wp-content/uploads/2010/02/circular-reasoning1.jpg?d9c344 – Mare Infinitus Jul 21 '12 at 11:06
  • Hmm, now I've read it again it seems a bit circular indeed, woops, what I meant was that it's common for a syntactical analyzer to have a rule like the above and that it can't distinguish between the error `type type` and `type identifier` if you do not check for aliases and keywords. (Edit: curious that someone with the name Infinitus is arguing for circular reasoning ;) ) – Roy T. Jul 21 '12 at 12:42