
The following code is giving us a bit of a headache: Clang and MSVC accept it, while GCC rejects it. We believe GCC is right this time, but I wanted to make sure before filing bug reports. So, are there any special rules for operator[] lookup that I'm unaware of?

struct X{};
struct Y{};

template<typename T>
struct B
{
    void f(X) { }
    void operator[](X){}
};

template<typename T>
struct C
{
    void f(Y) { }
    void operator[](Y){}
};

template<typename T> struct D : B<T>, C<T> {};

int main()
{
    D<float> d;
    // d.f(X()); // rejected by all three compilers
    d[Y()];      // accepted by Clang and MSVC, rejected by GCC
}

So, is the above code correct, i.e. should the operator[] call in main resolve successfully?

txtechhelp
u235axe
  • Note that the call `d.operator[](Y());` behaves exactly analogously to `d.f(...)`, so those compilers are intentionally applying a different rule to operators used as operators, vs member lookup via member-access. – Ben Voigt Feb 12 '16 at 22:49
  • I could be missing something, but I don't see any reason to allow `d[Y()]`. – aschepler Feb 12 '16 at 22:52
  • I get the same results if `B`, `C`, and `D` are not templates: clang++ accepts and g++ rejects. – aschepler Feb 12 '16 at 22:58
  • For `d[Y()]` I think there's a qualified name lookup for `D::operator []`, whereas in the non-operator case there's an unqualified lookup. But I don't entirely see why that makes a difference. – Alan Stokes Feb 12 '16 at 23:04
  • I remember looking into this for the case of the function call operator, and I couldn't find any justification for why it should work on Clang. My opinion is that Clang probably has a bug which affects both cases, the `()` and `[]` operators. See http://stackoverflow.com/questions/8083546/why-cant-gcc-disambiguate-multiple-inherited-functions-yet-clang-can – Brian Bi Feb 12 '16 at 23:16
  • Tried `d.operator[](Y())`: g++ still rejects the code as ambiguous, while `d.C::operator[](Y())` is accepted by g++. – Mine Mar 02 '16 at 08:55

2 Answers

It's not 100% clear which compiler is at fault. The standard has many rules for name lookup (which is what this issue comes down to), but section 13.5.5 specifically covers overloading operator[]:

13.5.5 Subscripting [over.sub]

1 - operator[] shall be a non-static member function with exactly one parameter. It implements the subscripting syntax

postfix-expression [ expr-or-braced-init-list ]

Thus, a subscripting expression x[y] is interpreted as x.operator[](y) for a class object x of type T if T::operator[](T1) exists and if the operator is selected as the best match function by the overload resolution mechanism (13.3.3).
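As a rough illustration of that paragraph, here is a minimal sketch in which the subscript syntax and the explicit member call reach the same function. The names Table, Index, viaSubscript and viaMemberCall are made up for this example and do not come from the question.

```cpp
#include <cassert>

// Hypothetical argument type for the subscript.
struct Index { int value; };

// Hypothetical class with a single, unambiguous operator[].
struct Table
{
    // Returns the index doubled so the call is observable.
    int operator[](Index i) const { return i.value * 2; }
};

// Operator syntax: t[i].
inline int viaSubscript(const Table& t, Index i) { return t[i]; }

// Explicit member-call syntax: t.operator[](i).
inline int viaMemberCall(const Table& t, Index i) { return t.operator[](i); }
```

With only one class involved there is nothing to disambiguate, so both spellings behave identically. The disagreement between compilers only shows up once operator[] has to be looked up in several base classes.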

Looking at the standard on Overloading (chapter 13):

13 Overloading [over]

1 - When two or more different declarations are specified for a single name in the same scope, that name is said to be overloaded. By extension, two declarations in the same scope that declare the same name but with different types are called overloaded declarations. Only function and function template declarations can be overloaded; variable and type declarations cannot be overloaded.

2 - When an overloaded function name is used in a call, which overloaded function declaration is being referenced is determined by comparing the types of the arguments at the point of use with the types of the parameters in the overloaded declarations that are visible at the point of use. This function selection process is called overload resolution and is defined in 13.3.

...

13.2 Declaration matching [over.dcl]

1 - Two function declarations of the same name refer to the same function if they are in the same scope and have equivalent parameter declarations (13.1). A function member of a derived class is not in the same scope as a function member of the same name in a base class.
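The last sentence of 13.2 can be sketched with a single base class. This is only an illustration: Base, Hiding, Exposing and canCallF are hypothetical names, and the functions return int (unlike the question's void) so the selected overload is observable.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct X {};
struct Y {};

struct Base
{
    int f(X) { return 1; }
};

struct Hiding : Base
{
    int f(Y) { return 2; }   // different scope: hides Base::f instead of overloading it
};

struct Exposing : Base
{
    using Base::f;           // brings Base::f into this class's scope
    int f(Y) { return 3; }   // now f(X) and f(Y) form one overload set
};

// Compile-time probe: is obj.f(arg) a valid expression?
template<typename T, typename Arg>
auto canCallF(int) -> decltype(std::declval<T&>().f(std::declval<Arg>()), std::true_type{});
template<typename T, typename Arg>
auto canCallF(...) -> std::false_type;

static_assert(!decltype(canCallF<Hiding, X>(0))::value, "Base::f(X) is hidden in Hiding");
static_assert(decltype(canCallF<Hiding, Y>(0))::value, "Hiding::f(Y) is callable");
static_assert(decltype(canCallF<Exposing, X>(0))::value, "using-declaration re-exposes f(X)");
```

The hidden function is still reachable with a qualified name such as h.Base::f(X()), which is the same mechanism as the explicit d.C::operator[](Y()) call mentioned in the comments.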

So according to this and section 10.2 on derived classes: you've declared struct D : B, C, and both B and C declare a member operator[] with different parameter types. Since D contains no using-declarations and declares no operator[] of its own, the two declarations live in different scopes and therefore do not overload with each other within D.

Based on this, MSVC and Clang are incorrect in their implementations since d[Y()] should be evaluated to d.operator[](Y()), which would produce an ambiguous name resolution; so the question is why do they accept the syntax of d[Y()] at all?

The only other areas I could see with regards to the subscript ([]) syntax make reference to section 5.2.1 (which states what a subscript expression is) and 13.5.5 (stated above), which means that those compilers are using other rules to further compile the d[Y()] expression.

If we look at name lookup, we see that 3.4.1 Unqualified name lookup paragraph 3 states that

The lookup for an unqualified name used as the postfix-expression of a function call is described in 3.4.2.

Where 3.4.2 states:

3.4.2 Argument-dependent name lookup [basic.lookup.argdep]

1 - When the postfix-expression in a function call (5.2.2) is an unqualified-id, other namespaces not considered during the usual unqualified lookup (3.4.1) may be searched, and in those namespaces, namespace-scope friend function or function template declarations (11.3) not otherwise visible may be found.

2 - For each argument type T in the function call, there is a set of zero or more associated namespaces and a set of zero or more associated classes to be considered. The sets of namespaces and classes is determined entirely by the types of the function arguments (and the namespace of any template template argument). Typedef names and using-declarations used to specify the types do not contribute to this set. The sets of namespaces and classes are determined in the following way:

...

(2.2) - If T is a class type (including unions), its associated classes are: the class itself; the class of which it is a member, if any; and its direct and indirect base classes. Its associated namespaces are the innermost enclosing namespaces of its associated classes. Furthermore, if T is a class template specialization, its associated namespaces and classes also include: the namespaces and classes associated with the types of the template arguments provided for template type parameters (excluding template template parameters); the namespaces of which any template template arguments are members; and the classes of which any member templates used as template template arguments are members. [ Note: Non-type template arguments do not contribute to the set of associated namespaces.—end note ]

Note the "may" in paragraph 1 above.

With the above points and a couple of others from 3.4 (name lookup), one could believe that Clang and MSVC are using these rules to find d[] first (and thus finding it as C::operator[]) vs. using 13.5.5 to turn d[] into d.operator[] and continuing compilation.

It should be noted that bringing the operators of the base classes into scope of the D class or using explicit scope does, however, 'fix' this issue across all three compilers (as is expected based on the using declaration clauses in the references), example:

struct X{};
struct Y{};

template<typename T>
struct B
{
    void f(X) { }
    void operator[](X) {}
};

template<typename T>
struct C
{
    void f(Y) { }
    void operator[](Y) {}
};

template<typename T>
struct D : B<T>, C<T>
{
    using B<T>::operator[];
    using C<T>::operator[];
};

int main()
{
    D<float> d;

    d.B<float>::operator[](X()); // OK
    //d.B<float>::operator[](Y()); // Error

    //d.C<float>::operator[](X()); // Error
    d.C<float>::operator[](Y()); // OK

    d[Y()]; // calls C<T>::operator[](Y)
    return 0;
}

Since the standard is ultimately left to the interpretation of the implementer, I'm not sure which compiler is technically correct here, since MSVC and Clang might be applying other rules to compile this. Still, given the subscripting paragraphs from the standard, I'm inclined to say they are not adhering to the standard as strictly as GCC is in this instance.

I hope this can add some insight into the problem.

txtechhelp
  • But the result of qualified name lookup of `T1::operator@`, that is, `D::operator[]` is an ambiguity error, not an overloaded set. – aschepler Feb 13 '16 at 02:08
  • In the operator case the text you quote says ADL is only allowed when looking for non-members, which these aren't. – Alan Stokes Feb 13 '16 at 10:12
  • I can't see clearly the interplay between what you mention and dependent name lookup. The dependent lookup should be ambiguous in both cases (and that is why I think GCC is right) according to: [isocpp.org nondependent name lookup](https://isocpp.org/wiki/faq/templates#nondependent-name-lookup-members) and [cppreference dependent names](http://en.cppreference.com/w/cpp/language/dependent_name) – u235axe Feb 13 '16 at 12:17
  • @u235axe, I've edited my answer to quote the standard; still not sure which is correct in implementation, but the real crux is why MSVC and Clang accept `d[Y()]` at all (vs. turning it into `d.operator[](Y())`) ... :/ – txtechhelp Feb 17 '16 at 00:50
  • @aschepler, agreed .. I've updated my answer to quote the standard; MSVC and Clang should be turning `d[]` into `d.operator[]`, but it looks like they are following other possible name look up rules to determine what `d[]` resolves to and finding `C::operator[]` as a match .... ? – txtechhelp Feb 17 '16 at 00:56
  • What does ADL have to do with this? We're not doing a function call. There is no non-member `operator[]` – Barry Mar 01 '16 at 14:55
  • @Barry, correct; I was more trying to point to a section in the standard where MSVC and Clang "might" be interpreting extra rules (i.e. as some "helper" extension to the language) to allow `d[Y()]` to parse correctly given the posted code, where as GCC disallows that code because it's following the standard (as seems to be the consensus). Was just trying to give a possible "why" to the noted oddity :) – txtechhelp Mar 03 '16 at 06:17

I believe that Clang and MSVC are incorrect, and GCC is correct to reject this code. This is an example of the principle that names in different scopes do not overload with each other. I submitted this to Clang as llvm bug 26850; we'll see if they agree.

There is nothing special about operator[] vs f(). From [over.sub]:

operator[] shall be a non-static member function with exactly one parameter. [...] Thus, a subscripting expression x[y] is interpreted as x.operator[](y) for a class object x of type T if T::operator[](T1) exists and if the operator is selected as the best match function by the overload resolution mechanism

So the rules governing the lookup of d[Y()] are the same as the rules governing d.f(X()). All the compilers were correct to reject the latter, and should have also rejected the former. Moreover, both Clang and MSVC reject

d.operator[](Y());

whereas both accept:

d[Y()];

despite the two having identical meaning. There is no non-member operator[], and this is not a function call so there is no argument-dependent lookup either.

What follows is an explanation of why the call should be viewed as ambiguous, despite one of the two inherited member functions seeming like it's a better match.


The rules for member name lookup are defined in [class.member.lookup]. This is already a little difficult to parse, plus it refers to C as the object we're looking up in (which in OP is named D, whereas C is a subobject). We have this notion of lookup set:

The lookup set for f in C, called S(f,C), consists of two component sets: the declaration set, a set of members named f; and the subobject set, a set of subobjects where declarations of these members (possibly including using-declarations) were found. In the declaration set, using-declarations are replaced by the set of designated members that are not hidden or overridden by members of the derived class (7.3.3), and type declarations (including injected-class-names) are replaced by the types they designate.

The declaration set for operator[] in D<float> is empty: there is neither an explicit declaration nor a using-declaration.

Otherwise (i.e., C does not contain a declaration of f or the resulting declaration set is empty), S(f,C) is initially empty. If C has base classes, calculate the lookup set for f in each direct base class subobject Bi, and merge each such lookup set S(f,Bi) in turn into S(f,C).

So we look into B<float> and C<float>.

The following steps define the result of merging lookup set S(f,Bi) into the intermediate S(f,C):
- If each of the subobject members of S(f,Bi) is a base class subobject of at least one of the subobject members of S(f,C), or if S(f,Bi) is empty, S(f,C) is unchanged and the merge is complete. Conversely, if each of the subobject members of S(f,C) is a base class subobject of at least one of the subobject members of S(f,Bi), or if S(f,C) is empty, the new S(f,C) is a copy of S(f,Bi).
- Otherwise, if the declaration sets of S(f,Bi) and S(f,C) differ, the merge is ambiguous: the new S(f,C) is a lookup set with an invalid declaration set and the union of the subobject sets. In subsequent merges, an invalid declaration set is considered different from any other.
- Otherwise, the new S(f,C) is a lookup set with the shared set of declarations and the union of the subobject sets.

The result of name lookup for f in C is the declaration set of S(f,C). If it is an invalid set, the program is ill-formed. [ Example:

struct A { int x; }; // S(x,A) = { { A::x }, { A } }
struct B { float x; }; // S(x,B) = { { B::x }, { B } }
struct C: public A, public B { }; // S(x,C) = { invalid, { A in C, B in C } }
struct D: public virtual C { }; // S(x,D) = S(x,C)
struct E: public virtual C { char x; }; // S(x,E) = { { E::x }, { E } }
struct F: public D, public E { }; // S(x,F) = S(x,E)
int main() {
    F f;
    f.x = 0; // OK, lookup finds E::x
}

S(x, F) is unambiguous because the A and B base subobjects of D are also base subobjects of E, so S(x,D) is discarded in the first merge step. —end example ]

So here's what happens. First, we merge the lookup set of operator[] in B<float> into the initially empty lookup set for D<float>. This gives us the declaration set {operator[](X)}.

Next, we merge that with the lookup set of operator[] in C<float>, whose declaration set is {operator[](Y)}. These declaration sets differ, and neither subobject is a base of the other, so the merge is ambiguous. Note that overload resolution is not considered here. We are simply looking up the name.
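To make the merge concrete, here is a sketch using non-template versions of the classes (a comment on the question reports the same compiler behaviour without templates). As an assumption for illustration, the operators return int rather than void, and the helper functions callThroughB and callThroughC are invented names. A qualified name starts lookup in a single base class, so no merge of differing declaration sets takes place.

```cpp
#include <cassert>

struct X {};
struct Y {};

struct B { int operator[](X) { return 1; } };
struct C { int operator[](Y) { return 2; } };

// No using-declarations: looking up operator[] in D has to merge
// S(operator[], B) and S(operator[], C), whose declaration sets differ.
struct D : B, C {};

// Qualified names pick the base class to search, so lookup is unambiguous.
inline int callThroughB(D& d) { return d.B::operator[](X()); }
inline int callThroughC(D& d) { return d.C::operator[](Y()); }

// By contrast, this unqualified member call is rejected as ambiguous:
// inline int broken(D& d) { return d.operator[](Y()); }
```

The commented-out call is the form that, per the discussion above, every compiler rejects. Only the operator spelling d[Y()] is treated differently by Clang and MSVC.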

The fix, by the way, is to add using-declarations to D<T> such that there is no merge step done:

template<typename T> struct D : B<T>, C<T> {
    using B<T>::operator[];
    using C<T>::operator[];
};
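
A self-contained sketch of that fix follows. As an assumption for illustration, the operators return int instead of void so the chosen overload can be checked, and subscriptWithX and subscriptWithY are hypothetical helper names.

```cpp
#include <cassert>

struct X {};
struct Y {};

template<typename T>
struct B { int operator[](X) { return 1; } };

template<typename T>
struct C { int operator[](Y) { return 2; } };

template<typename T>
struct D : B<T>, C<T>
{
    // Both declarations now belong to D's own declaration set,
    // so lookup stops in D and overload resolution picks the match.
    using B<T>::operator[];
    using C<T>::operator[];
};

inline int subscriptWithX() { D<float> d; return d[X()]; }
inline int subscriptWithY() { D<float> d; return d[Y()]; }
```

With the using-declarations in place, d[X()] and d[Y()] are ordinary overload resolution within one scope, which is why this version compiles on all three compilers.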
Barry