We believe this example exhibits a bug in the C# compiler (do make fun of us if we are wrong). This bug may be well known: after all, our example is a simple modification of what is described in this blog post.

using System;

namespace GenericConflict
{
  class Base<T, S>
  {
    public virtual int Foo(T t)
    { return 1; }
    public virtual int Foo(S s)
    { return 2; }

    public int CallFooOfT(T t)
    { return Foo(t); }
    public int CallFooOfS(S s)
    { return Foo(s); }
  }

  class Intermediate<T, S> : Base<T, S>
  {
    public override int Foo(T t)
    { return 11; }
  }

  class Conflict : Intermediate<string, string>
  {
    public override int Foo(string t)
    { return 101;  }
  }


  static class Program
  {
    static void Main()
    {
      var conflict = new Conflict();
      Console.WriteLine(conflict.CallFooOfT("Hello mum"));
      Console.WriteLine(conflict.CallFooOfS("Hello mum"));
    }
  }
}

The idea is simply to create a class Base<T, S> with two virtual methods whose signatures become identical after an 'evil' choice of T and S. The class Conflict overrides only one of the virtual methods, and because of the existence of Intermediate<,>, it should be well-defined which one!

But when the program is run, the output seems to show that the wrong overload was overridden.

When we read Sam Ng's follow-up post, we get the impression that the bug was not fixed because they believed a type-load exception would always be thrown. But in our example the code compiles and runs with no errors (just unexpected output).


Addition in 2020: This was corrected in later versions of the C# compiler (Roslyn?). When I asked this question, the output was:

11
101

As of 2020, tio.run gives this output:

101
2
Jeppe Stig Nielsen
  • Assume we don't have a compiler handy - what do you expect as the output, and what *is* the output? – Marc Gravell Apr 16 '12 at 16:04
  • Additionally, what version are you running it on, and does it behave differently in any other versions? – Servy Apr 16 '12 at 16:05
  • Can you elaborate on the statement: "because of the existence of Intermediate<,>, it should be well-defined which one"? – StriplingWarrior Apr 16 '12 at 16:06
  • And how would you expect the compiler to know which of two identical methods you want to override? – Jim Mischel Apr 16 '12 at 16:07
  • How would the compiler know which of the Foo has to be overridden? It could override any of the two methods. There is no reason for overriding one of them and not the other! – JotaBe Apr 16 '12 at 16:08
  • For me, the program returns 11, then 101. .NET 4. – Dmytro Shevchenko Apr 16 '12 at 16:09
  • @lukazoid: Intermediate should override the first of them, as it's using the first type parameter T (and not S). The problem is in the Conflict class, which has no way to guess what to override, as S and T are the same in this case. – JotaBe Apr 16 '12 at 16:10
  • I don't understand why the intermediate class makes it obvious which one you want to override. Just because it overrides one? To me it would seem just as obvious that since the Intermediate class overrides Foo(T t), my subclass wants to override Foo(S s). At any rate, I don't see how you can expect the compiler to make a reasonable decision, because the two signatures are exactly the same. I'd just chalk this one up to things you could do but are a very bad idea. – Matthew Sanford Apr 16 '12 at 16:10
  • @JotaBe I get that, what I meant was how does `Intermediate<,>` well define which method gets overloaded in `Conflict`, which seems to be exactly what you're saying. – Lukazoid Apr 16 '12 at 16:11
  • @Lukazoid: Intermediate overrides `public override int Foo(T t)`, where the type T is the first type parameter in both the `Base` and `Intermediate` classes. When `Intermediate` inherits `Base`, both T and S are well identified, so it's clear which is the overridden method. – JotaBe Apr 16 '12 at 16:19
  • Sorry @JotaBe I've been typing overloads instead of overrides, hence the mixup, it's been a long day... – Lukazoid Apr 16 '12 at 16:24
  • See my last comment to StriplingWarrior's answer for a reference to the C# spec which I believe is relevant. @MarcGravell: Output is 11, 101. Expected output is 101, 2. /JeppeSN – Jeppe Stig Nielsen Apr 16 '12 at 17:09
  • You're operating under the misconception that `Intermediate` overrides `Foo(T)`. But it doesn't. It's overriding `Foo(string)` (or whatever the `T` parameter is). The compiler doesn't know which parameter (`T` or `S`) it's overriding. It just knows what *type*. – Jim Mischel Apr 16 '12 at 18:17
  • @JimMischel: Your analysis is incorrect; the compiler knows that `Intermediate.Foo(T)` overrides `Base.Foo(T)`. If it helps, rename the type parameters: `Intermediate<U, V>` inherits from `Base<U, V>`. That type has a method `Foo(U)` and a method `Foo(V)`. Clearly `Intermediate.Foo(U)` overrides `Base.Foo(U)`, not `Base.Foo(V)`! – Eric Lippert Apr 16 '12 at 18:49
  • @EricLippert: I don't see where renaming the parameters helps. If I have `Base<T, S>` and `Intermediate<U, V>`, then there is no named mapping between the type parameters. I guess it could be done positionally. Nevertheless, I accept that you know a whole lot more about the compiler than I do, so I stand corrected. I thought the compiler was basing its decisions solely on the expanded types, not the names or positions of the type parameters. – Jim Mischel Apr 16 '12 at 19:09
  • @JimMischel: Of course there is a mapping between the type parameters: **the mapping is provided by the usage of the type parameters of `Intermediate` as type arguments of `Base`.** That's precisely what substitution of arguments for parameters *is*: it is such a mapping. – Eric Lippert Apr 16 '12 at 20:05
  • @JimMischel: Maybe this will help. Suppose that the declaration was `Intermediate<U, V> : Base<U, V>`. In that case, `Intermediate.Foo(U)` overrides the method declared as `Base.Foo(T)`, because *under the substitution* `T-->U, S-->V` given in the base class clause, `Base.Foo(T)-->Base.Foo(U)`. If instead it had been `Intermediate<U, V> : Base<V, U>` then `Intermediate.Foo(U)` overrides `Base.Foo(S)` because the mapping is `T-->V, S-->U` in the base class clause. Does that now make sense? – Eric Lippert Apr 16 '12 at 20:10
  • @EricLippert: I see your point. Thanks. – Jim Mischel Apr 16 '12 at 20:59

1 Answer

We believe this example exhibits a bug in the C# compiler.

Let's do what we should always do when exhibiting a compiler bug: carefully contrast the expected and observed behaviours.

The observed behaviour is that the program produces 11 and 101 as the first and second outputs, respectively.

What is the expected behaviour? There are two "virtual slots". The first output should be the result of calling the method in the Foo(T) slot. The second output should be the result of calling the method in the Foo(S) slot.

What goes in those slots?

In an instance of Base<T,S> the return 1 method goes in the Foo(T) slot, and the return 2 method goes in the Foo(S) slot.

In an instance of Intermediate<T,S> the return 11 method goes in the Foo(T) slot and the return 2 method goes in the Foo(S) slot.

Hopefully so far you agree with me.
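The slot picture for the two non-conflicting classes can be checked with a small sketch. It assumes type arguments that do not unify (`int` and `string`), so nothing here depends on the runtime's handling of the conflict case. The `Swapped` class is a hypothetical addition, not part of the question: it passes its type parameters to `Base` in reverse order, so its override lands in the Foo(S) slot instead.

```csharp
using System;

namespace SlotSketch
{
  class Base<T, S>
  {
    public virtual int Foo(T t) { return 1; }   // occupies the Foo(T) slot
    public virtual int Foo(S s) { return 2; }   // occupies the Foo(S) slot

    public int CallFooOfT(T t) { return Foo(t); }
    public int CallFooOfS(S s) { return Foo(s); }
  }

  // Base class clause maps T --> U, S --> V, so Foo(U) overrides Base.Foo(T).
  class Intermediate<U, V> : Base<U, V>
  {
    public override int Foo(U u) { return 11; }
  }

  // Hypothetical variant: the clause maps T --> V, S --> U,
  // so Foo(U) overrides Base.Foo(S) instead.
  class Swapped<U, V> : Base<V, U>
  {
    public override int Foo(U u) { return 12; }
  }

  static class Program
  {
    static void Main()
    {
      var straight = new Intermediate<int, string>();
      Console.WriteLine(straight.CallFooOfT(1));    // 11: Foo(T) slot was overridden
      Console.WriteLine(straight.CallFooOfS("x"));  // 2:  Foo(S) slot is untouched

      var swapped = new Swapped<int, string>();     // this is a Base<string, int>
      Console.WriteLine(swapped.CallFooOfT("x"));   // 1:  Foo(T) slot is untouched
      Console.WriteLine(swapped.CallFooOfS(1));     // 12: Foo(S) slot was overridden
    }
  }
}
```

As long as T and S stay distinct, the override target is fixed by the base class clause alone; the trouble only starts once both type arguments are the same type.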

In an instance of Conflict, there are four possibilities:

  • Possibility one: the return 11 method goes in the Foo(T) slot and the return 101 method goes in the Foo(S) slot.
  • Possibility two: the return 101 method goes in the Foo(T) slot and the return 2 method goes in the Foo(S) slot.
  • Possibility three: the return 101 method goes in both slots.
  • Possibility four: the compiler detects that the program is ambiguous and issues an error.

You expect that one of two things will happen here, based on section 10.6.4 of the specification. Either:

  1. The compiler will determine that the method in Conflict overrides the method in Intermediate<string, string>, because the method in the intermediate class is found first. In this case, possibility two is the correct behaviour. Or:
  2. The compiler will determine that the method in Conflict is ambiguous as to which original declaration it overrides, and therefore possibility four is the correct one.

In neither case is possibility one correct.

It is not 100% clear, I admit, which of these two is correct. My personal feeling is that the more sensible behaviour is to treat an overriding method as a private implementation detail of the intermediate class; the relevant question to my mind is not whether the intermediate class overrides a base class method, but rather whether it declares a method with a matching signature. In that case the correct behaviour would be to pick possibility four.

What the compiler actually does is what you expect: it picks possibility two. Because the intermediate class has a member which matches, we choose it as "the thing to override", regardless of the fact that the method is not declared in the intermediate class. The compiler determines that Intermediate<string, string>.Foo is the method overridden by Conflict.Foo, and emits the code accordingly. It does not produce an error because it judges that the program is not in error.

So if the compiler is correctly analyzing the code, choosing possibility two, and not producing an error, then why at runtime does it appear that the compiler chose possibility one, not possibility two?

Because making a program that causes two methods to unify under generic construction is implementation-defined behaviour for the runtime. The runtime can choose to do anything in this case! It can choose to give a type load error. It can give a verifiability error. It can choose to allow the program but fill in the slots according to some criterion of its own choosing. And in fact the latter is what it does. The runtime takes a look at the program emitted by the C# compiler and decides on its own that possibility one is the correct way to analyze this program.

So, now we have the rather philosophical question of whether or not this is a compiler bug; the compiler is following a reasonable interpretation of the specification, and yet we still do not get the behaviour we expect. In that sense, it very much is a compiler bug. The job of the compiler is to translate a program written in C# into an exactly equivalent program written in IL. The compiler is failing to do so; it is translating a program written in C# into a program written in IL that has implementation-defined behaviour, not the behaviour specified by the C# language specification.

As Sam clearly describes in his blog post, we are well aware of this mismatch between what type topologies the C# language endows with specific meanings and what topologies the CLR endows with specific meanings. The C# language is reasonably clear that possibility two is arguably the correct one, but there is no code we can emit that makes the CLR do that because the CLR fundamentally has implementation-defined behaviour any time two methods unify to have the same signature. Our choices are therefore:

  • Do nothing. Allow these crazy, unrealistic programs to continue to have behaviour that does not precisely match the C# specification.
  • Use heuristics. As Sam notes, we could be more clever about using metadata mechanisms to tell the CLR what methods override what other methods. But... those mechanisms use the method signatures to disambiguate ambiguous cases and now we are back in the same boat as we were before; we are now using a mechanism with implementation-defined behaviour in order to disambiguate a program with implementation-defined behaviour! This is a non-starter.
  • Cause the compiler to produce warnings or errors whenever it might be emitting a program whose behaviour is implementation-defined by the runtime.
  • Fix the CLR so that behaviour of type topologies that cause methods to unify in signature is well-defined and matches that of the C# language.

The last choice is extremely expensive. Paying that cost buys us a vanishingly small user benefit, and directly takes budget away from solving realistic problems faced by users writing sensible programs. And in any event, the decision to do that is entirely out of my hands.

We on the C# compiler team have therefore chosen to take a combination of the first and third strategies; sometimes we produce warnings or errors for such situations, and sometimes we do nothing and allow the program to do something strange at runtime.
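For readers who want to spot the situation in their own code, here is a rough sketch of a reflection-based check. It is not what the compiler or the CLR does; `FindUnifiedSignatures` is a hypothetical helper that only compares method names and parameter types on one closed constructed type, ignoring return types and generic method arity.

```csharp
using System;
using System.Linq;
using System.Reflection;

namespace UnifySketch
{
  class Base<T, S>
  {
    public virtual int Foo(T t) { return 1; }
    public virtual int Foo(S s) { return 2; }
  }

  static class UnificationCheck
  {
    // Hypothetical helper: lists every "Name(ParameterTypes)" that is shared
    // by more than one method declared directly on the given type.
    public static string[] FindUnifiedSignatures(Type constructed)
    {
      const BindingFlags flags =
        BindingFlags.Public | BindingFlags.NonPublic |
        BindingFlags.Instance | BindingFlags.Static |
        BindingFlags.DeclaredOnly;

      return constructed.GetMethods(flags)
        .GroupBy(m => m.Name + "(" + string.Join(", ",
          m.GetParameters().Select(p => p.ParameterType.ToString())) + ")")
        .Where(g => g.Count() > 1)
        .Select(g => g.Key)
        .ToArray();
    }
  }

  static class Program
  {
    static void Main()
    {
      // Distinct type arguments: no two methods share a signature.
      Console.WriteLine(
        UnificationCheck.FindUnifiedSignatures(typeof(Base<int, string>)).Length);

      // Identical type arguments: both Foo methods now take a string.
      foreach (var sig in
        UnificationCheck.FindUnifiedSignatures(typeof(Base<string, string>)))
      {
        Console.WriteLine(sig);
      }
    }
  }
}
```

A check like this could run in a unit test over the constructed types a project actually uses; it only reports the collision and says nothing about which slot the runtime will fill.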

Since in practice these sorts of programs very rarely arise in realistic line-of-business programming scenarios, I don't feel very bad about these corner cases. If they were cheap and easy to fix then we would fix them, but they're neither cheap nor easy to fix.

If this subject interests you, see my article on yet another way in which causing two methods to unify leads to a warning and implementation-defined behaviour:

http://blogs.msdn.com/b/ericlippert/archive/2006/04/05/odious-ambiguous-overloads-part-one.aspx

http://blogs.msdn.com/b/ericlippert/archive/2006/04/06/odious-ambiguous-overloads-part-two.aspx

Eric Lippert
  • OK, so C# (section 10.6.4) allows us to write something (your bullet "Possibility two") that there is no chance of translating into correct (implementation-independent) IL, as I understand it. That seems quite unfortunate. But here is a suggestion: Why not have the compiler emit a warning "Now your code is implementation-dependent because you got two identical methods" every time someone constructs a generic type where two methods "unify" (become indistinguishable to the CLR)? What do you think? /JeppeSN – Jeppe Stig Nielsen Apr 16 '12 at 18:53
  • @JeppeStigNielsen: I refer you to the paragraph of my answer which says "*We on the C# compiler team have therefore chosen to take a combination of the first and third strategies; sometimes we produce warnings or errors for such situations, and sometimes we do nothing and allow the program to do something strange at runtime.*" This is one of the situations in which we have chosen to do nothing rather than produce a warning. There is nothing stopping us from producing a warning except for the small fact of having a list of more important work items literally longer than you are tall. – Eric Lippert Apr 16 '12 at 18:56
  • "Our" implementation of CLR seems to choose always the **last** method (in textual order) when deciding what `Conflict.Foo` should match. I verified this even with three methods (so in `Base`). /JeppeSN – Jeppe Stig Nielsen Apr 16 '12 at 19:29
  • @EricLippert, completely off-topic, but is it possible to take a peek at that list of features presumably over 7 feet tall? Just to see what people want from C#. – Roman Royter Apr 16 '12 at 20:19
  • @RomanRoyter: I said work items, not features; work items include bugs, and include things that are not language features, like making analyzers that work well with IntelliSense. The list that contains just possible language features is not longer than you are tall, but is longer than your arm. Neither list is in a form that we could reasonably publish. – Eric Lippert Apr 16 '12 at 20:27
  • Perhaps there is another option. Amend the C# specification to make this undefined behavior at the C# level. Then the translation to IL with implementation-defined behavior or a compile time error are both acceptable. That change would reflect the reality of the situation. – Kevin Cathcart Apr 16 '12 at 21:28
  • Also, I believe that the unifying is actually undefined behavior, not implementation-defined, since the CLI spec simply says the class is "invalid", which would mean the implementation is not required to document what it does. Documenting the behavior is part of the definition of implementation-defined (in C++) and implementation-specific (in ECMA CIL). – Kevin Cathcart Apr 16 '12 at 21:30
  • Is the runtime result of unification only undefined in cases where code is not clear about what's being overridden? For example, if `conflict` didn't contain any overrides itself, would the behavior of the rest of the program be well-defined? – supercat Apr 17 '12 at 16:18
  • @supercat: The CLI specification states "Type definitions are invalid if, after substituting base class generic arguments, two methods result in the same name and signature (including return type)." So the moment that `Conflict` inherited from `Intermediate` the program contained an "invalid type definition". You're in the weeds at that point. – Eric Lippert Apr 17 '12 at 16:32
  • @EricLippert: Interesting, since it would seem that other than the method override in `Conflict` there is no ambiguity; `CallFooOfT` would be bound *at compile-time* to the `T` overload, and `CallFooOfS` to the `S` overload. What if one adds an intervening class `Woozle` which inherits `Base`? What would be the validity of `Woozle`, `Woozle`, `Woozle`, or `Woozle where TTT:Animal`? – supercat Apr 17 '12 at 17:31