Take a look at the following two expressions:

baz(Foo<Bar, Bar>(0))
baz(Foo < Bar, Bar > (0))

Without knowing what baz, Foo and Bar are (baz can be a type or a method; Foo and Bar can be types or variables), there is no way to tell whether the < begins a type argument list or is a less-than operator.

// two different outcomes, difference shown with parentheses
baz((Foo<Bar,Bar>(0)))      // generics
baz((Foo < Bar), (Bar > 0)) // less-than

No sane programming language should depend on what baz, Foo and Bar are when parsing an expression like this. Yet Swift manages to disambiguate the expression below no matter where I place whitespace:

println(Dictionary<String, String>(0))
println(Dictionary < String, String > (0))

How does the compiler manage this? And, more importantly, is there any place in the Swift language specification where the rules for this are described? Looking through the Language Reference part of the Swift book, I only found this section:

In certain constructs, operators with a leading < or > may be split into two or more tokens. The remainder is treated the same way and may be split again. As a result, there is no need to use whitespace to disambiguate between the closing > characters in constructs like Dictionary<String, Array<Int>>. In this example, the closing > characters are not treated as a single token that may then be misinterpreted as a bit shift >> operator.
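To make the quoted passage concrete, here is a minimal sketch of the two readings of >>. The constant names are only illustrative, and it assumes a reasonably recent Swift toolchain:

```swift
// Here the two closing '>' characters each end a generic argument list,
// so '>>' is split into two tokens rather than read as a shift operator.
let nested = Dictionary<String, Array<Int>>()

// Here '>>' stays a single token: the right-shift operator.
let shifted = 8 >> 1

print(nested.isEmpty, shifted)
```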

What does "certain constructs" refer to in this context? The actual grammar only contains one production rule that mentions type arguments:

explicit-member-expression → postfix-expression . identifier generic-argument-clause(opt)

Any explanation or resource would be greatly appreciated.

Clashsoft
    Note that the Swift compiler is Open Source. Here is a starting point with an overview of the architecture and links to the source code: https://swift.org/compiler-stdlib/#compiler-architecture. – Martin R Apr 03 '16 at 17:20

1 Answer

Thanks to @Martin R, I found the relevant part of the compiler source code, which contains a comment that explains how it resolves the ambiguity.

swift/ParseExpr.cpp, line 1533:

///   The generic-args case is ambiguous with an expression involving '<'
///   and '>' operators. The operator expression is favored unless a generic
///   argument list can be successfully parsed, and the closing bracket is
///   followed by one of these tokens:
///     lparen_following rparen lsquare_following rsquare lbrace rbrace
///     period_following comma semicolon

Basically, the compiler attempts to parse a list of types and then checks the token after the closing angle bracket. If that token is

  • a closing parenthesis, bracket or brace,
  • an opening parenthesis, an opening bracket or a period, with no whitespace between it and the closing angle bracket (>(, >[, but not > (, > [),
  • an opening brace or
  • a comma or semicolon,

it parses the expression as a generic call; otherwise it parses it as one or more relational expressions.
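The check on the following token can be sketched roughly as below. This is not the compiler's actual code (that lives in ParseExpr.cpp and is written in C++); the Token enum and the function name confirmsGenericArguments are hypothetical and only model my reading of the comment above:

```swift
// Hypothetical token model. `adjacent` is true when the token directly
// follows the closing '>' with no whitespace in between.
enum Token {
    case lParen(adjacent: Bool)
    case lSquare(adjacent: Bool)
    case period(adjacent: Bool)
    case rParen, rSquare, lBrace, rBrace, comma, semicolon
    case other(String)
}

// Called after a generic argument list has been parsed successfully.
// Returns true if the next token lets the parser keep the generic reading;
// false means it should fall back to '<' and '>' as operators.
func confirmsGenericArguments(_ next: Token) -> Bool {
    switch next {
    case .lParen(let adjacent), .lSquare(let adjacent), .period(let adjacent):
        // '(', '[' and '.' only count when they touch the closing '>'.
        return adjacent
    case .rParen, .rSquare, .lBrace, .rBrace, .comma, .semicolon:
        return true
    case .other:
        return false
    }
}

print(confirmsGenericArguments(.lParen(adjacent: true)))   // models Foo<Bar, Bar>(0)
print(confirmsGenericArguments(.lParen(adjacent: false)))  // models Foo<Bar, Bar> (0)
print(confirmsGenericArguments(.other("x")))               // models Foo<Bar, Bar> x
```

Under this model only the first call reports a generic argument list; the other two fall back to the operator reading.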

As described in the book Annotated C#, the problem is solved in a similar way in C#.

Clashsoft
    Your answer helped me appreciate how good your question is. I guess I've seen so many poorly worded questions recently that I did not credit yours with due attention. +1 for both Q and A. Very nice find in the parser comments! The only thing I could add at this point is that it is also fun and instructive to play with the compiler directly from the command line: `swiftc -dump-parse hello.swift` (or `-dump-ast`, `-dump-type-refinement-contexts`, etc.). – Milos Apr 04 '16 at 11:04