
I've always wondered how the dependencies between a programming language and its libraries are managed. Take C#, for example. When I was beginning to learn about computing, I would assume (wrongly as it turns out) that the language itself is designed independently of the class libraries that would eventually become available for it. That is, the set of language keywords (such as for, class or throw) plus the syntax and semantics are defined first, and libraries that can be used from the language are developed separately. The specific classes in those libraries, I used to think, should not have any impact on the design of the language.

But that doesn't work, or at least not all the time. Consider throw. The C# compiler makes sure that the expression following throw resolves to an exception type. Exception is a class in a library, and as such it should not be special at all. It would be a class like any other, except that the C# compiler assigns it special semantics. That is fine, but my conclusion is that the design of the language does depend on the existence and behaviour of specific elements in the class libraries.

Additionally, I wonder how this dependency is managed. If I were to design a new programming language, what techniques would I use to map the semantics of throw to the very particular class that is Exception?

So I have two questions:

  • Am I correct in thinking that language design is tightly coupled to that of its base class libraries?
  • How are these dependencies managed from within the compiler and run-time? What techniques are used?

Thank you.

EDIT. Thanks to those who pointed out that my second question is very vague. I agree. What I am trying to learn is what kind of references the compiler stores about the types it needs. For example, does it find the types by some kind of unique id? What happens when a new version of the compiler or the class libraries is released? I am aware that this is still pretty vague, and I don't expect a precise, single-paragraph answer; rather, pointers to literature or blog posts are most welcome.

CesarGon
  • See http://docs.oracle.com/javase/1.4.2/docs/api/java/lang/package-summary.html _Provides classes that are fundamental to the design of the Java programming language._ – flup May 07 '13 at 18:55
  • And http://msdn.microsoft.com/en-us/library/yxcx7skw(v=vs.71).aspx _The System namespace contains fundamental classes and base classes that define commonly-used value and reference data types, events and event handlers, interfaces, attributes, and processing exceptions. Other classes provide services supporting data type conversion, method parameter manipulation, mathematics, remote and local program invocation, application environment management, and supervision of managed and unmanaged applications._ – flup May 07 '13 at 18:56
  • The answer to the first question is "yes". The second question is so impossibly vague that I have no idea how to even begin to answer it. – Eric Lippert May 07 '13 at 19:21
  • @EricLippert: Thanks, and yes, I agree my second question is too vague. I have added some details now. – CesarGon May 07 '13 at 19:52

4 Answers


What I am trying to learn is what kind of references the compiler stores about the types it needs. For example, does it find the types by some kind of unique id?

Obviously the C# compiler maintains an internal database of all the types available to it in both source code and metadata; this is why a compiler is called a "compiler" -- it compiles a collection of data about the sources and libraries.

When the C# compiler needs to, say, check whether an expression that is thrown is derived from or identical to System.Exception, it pretends to do a global namespace lookup on System, and then it does a lookup on Exception, finds the class, and then compares the resulting class information to the type that was deduced for the expression.

The compiler team uses this technique because that way it works no matter whether we are compiling your source code and System.Exception is in metadata, or if we are compiling mscorlib itself and System.Exception is in source.

Of course as a performance optimization the compiler actually has a list of "known types" and populates that list early so that it does not have to undergo the expense of doing the lookup every time. As you can imagine, the number of times you'd have to look up the built-in types is extremely large. Once the list is populated then the type information for System.Exception can be just read out of the list without having to do the lookup.
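As a rough illustration of the lookup-then-cache approach described above, here is a toy model in Python. It is not the C# compiler's code: `TypeSymbol`, `Namespace`, `Compilation`, `KNOWN_TYPE_NAMES` and `check_throw` are hypothetical names invented for this sketch, and real symbol tables carry far more information.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical list of types the toy compiler resolves once, up front.
KNOWN_TYPE_NAMES = ("System.Object", "System.Exception")


@dataclass
class TypeSymbol:
    """A class the compiler knows about, whether from source or metadata."""
    name: str
    base: Optional["TypeSymbol"] = None

    def derives_from(self, other: "TypeSymbol") -> bool:
        """True if this type is `other` itself or inherits from it."""
        current: Optional[TypeSymbol] = self
        while current is not None:
            if current is other:
                return True
            current = current.base
        return False


@dataclass
class Namespace:
    """Nested namespaces and types, keyed by simple name."""
    namespaces: Dict[str, "Namespace"] = field(default_factory=dict)
    types: Dict[str, TypeSymbol] = field(default_factory=dict)


class Compilation:
    def __init__(self, global_namespace: Namespace):
        self.global_namespace = global_namespace
        self.lookups = 0  # number of full name lookups performed
        # Populate the known-types list early, using the ordinary lookup.
        self.known_types = {n: self.lookup(n) for n in KNOWN_TYPE_NAMES}

    def lookup(self, qualified_name: str) -> TypeSymbol:
        """Walk from the global namespace down to the named type."""
        self.lookups += 1
        *namespace_parts, type_name = qualified_name.split(".")
        scope = self.global_namespace
        for part in namespace_parts:
            scope = scope.namespaces[part]
        return scope.types[type_name]

    def check_throw(self, expression_type: TypeSymbol) -> bool:
        """A thrown expression must be, or derive from, System.Exception."""
        exception = self.known_types["System.Exception"]  # no lookup needed
        return expression_type.derives_from(exception)


# A tiny stand-in for the library's type hierarchy.
obj = TypeSymbol("Object")
exc = TypeSymbol("Exception", base=obj)
io_exc = TypeSymbol("IOException", base=exc)
string = TypeSymbol("String", base=obj)
system = Namespace(
    types={"Object": obj, "Exception": exc, "String": string},
    namespaces={"IO": Namespace(types={"IOException": io_exc})},
)
compilation = Compilation(Namespace(namespaces={"System": system}))
```

In this model `compilation.check_throw(io_exc)` accepts the type and `compilation.check_throw(string)` rejects it. Because the lookup goes by name through the global namespace, it does not matter whether the `Exception` symbol came from library metadata or from source being compiled.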

What happens when a new version of the compiler or the class libraries is released?

What happens is: a whole bunch of developers, testers, managers, designers, writers and educators get together and spend a few million man-hours making sure that the compiler and the class libraries all work before they're released.

This question is, again, impossibly vague. What has to happen to make a new compiler release? A lot of work, that's what has to happen.

I am aware that this is still pretty vague, and I don't expect a precise, single-paragraph answer; rather, pointers to literature or blog posts are most welcome.

I write a blog about, among other things, the design of the C# language and its compiler. It's at http://ericlippert.com.

Eric Lippert

I would assume (perhaps wrongly) that the language itself is designed independently of the class libraries that would eventually become available for it.

Your assumption is, in the case of C#, completely wrong. C# 1.0, the CLR 1.0 and the .NET Framework 1.0 were all designed together. As the language, runtime and framework evolved, the designers of each worked very closely together to ensure that the right resources were allocated so that each could ship new features on time.

I do not understand where your completely false assumption comes from; that sounds like a highly inefficient way to write a high-level language and a great way to miss your deadlines.

I can see writing a language like C, which is basically a more pleasant syntax for assembler, without a library. But how would you possibly write, say, async-await without having the guy designing Task<T> in the room with you? It seems like an exercise in frustration.

Am I correct in thinking that language design is tightly coupled to that of its base class libraries?

In the case of C#, yes, absolutely. There are dozens of types that the C# language assumes are available and as-documented in order to work correctly.

I once spent a very frustrating hour with a developer who was having some completely crazy problem with a foreach loop before I discovered that he had written his own IEnumerable<T> that had slightly different methods than the real IEnumerable<T>. The solution to his problem: don't do that.

How are these dependencies managed from within the compiler and run-time?

I don't know how to even begin to answer this impossibly vague question.

Eric Lippert
  • Thanks. I've added some details to the second question. – CesarGon May 07 '13 at 19:53
  • I think this leaves out the important fact that C# can, in fact, be made to work without the .NET framework and CLR runtime. Mono being the obvious example. In addition, from what I understand you guys (on the C# team or formerly so) went to great lengths to make features like `async`-`await` depend on pattern matching rather than hardcoding class dependencies. – MgSam May 07 '13 at 21:35
  • @MgSam: Mono depends on a *near exact copy of the .NET Framework and CLR*, so I'm not sure how that's relevant. The point is that the language depends on types in the framework; what organization wrote an implementation of that framework isn't germane. And yes, many features use a "pattern matching" approach for extensibility, but (1) you can't make an async method that returns `MyTask` -- it's got to be the "real" `Task` or a type so similar that the compiler can't tell the difference, and (2) my point is that the TPL team and the C# team worked in concert on the feature. – Eric Lippert May 07 '13 at 21:43
  • I was just trying to point out that your answer seemed to imply C# is .NET-only, which is clearly not the case. With Xamarin you can write applications which are cross-compiled for platforms with no notion of .NET. – MgSam May 10 '13 at 15:33

All (practical) programming languages have a minimum number of required functions. For modern "OO" languages, this also includes a minimum number of required types.

If the type is required in the Language Specification, then it is required - regardless of how it is packaged.

Conversely, not all of the BCL is required to have a valid C# implementation. This is because not all of the BCL types are required by the Language Specification. For instance, System.Exception (see #16.2) and NullReferenceException are required, but FileNotFoundException is not required to implement the C# Language.

Note that even though the specification provides minimal definitions for base types (e.g. System.String), it does not define the commonly-accepted methods (e.g. String.Replace). That is, almost all of the BCL is outside the scope of the Language Specification [1].


.. but my conclusion is that the design of the language does depend on the existence and behaviour of specific elements in the class libraries.

I agree entirely and have included examples (and limits of such definitions) above.

.. If I were to design a new programming language, what techniques would I use to map the semantics of "throw" to the very particular class that is "Exception"?

I would not look primarily at the C# specification, but rather I would look at the Common Language Infrastructure specification. This new language should, for practical reasons, be designed to interoperate with existing CLI/CLR languages, but does not necessarily need to "be C#".

[1] The CLI (and associated references) do define the requirements of a minimal BCL. So if it is taken that a valid C# implementation must conform to (or may assume) the CLI then there are many other types to consider that are not mentioned in the C# specification itself.


Unfortunately, I do not have sufficient knowledge of the 2nd (and more interesting) question.

user2246674

My impression is that in languages like C# and Ada, application source code is portable across compilers/implementations, but standard library source code is not.

user1358