I will focus on these two:
- Enforcing trait constraints for type parameters only for selected enum constructors
- Learning constraints upon destructing an enum and learning its constructor
The first one is much simpler than the second, but not very useful on its own, so we'll try to encode both. Let's start with a simple Haskell definition that we'll then implement in Rust:
{-# LANGUAGE GADTs #-}

class Foo a where
  foo :: a -> Bool

data GADT a where
  SomeFoo :: Foo a => a -> GADT a
  Other :: a -> GADT a

-- example usage
instance Foo Int where
  foo _ = True

doThing :: GADT a -> Bool
doThing (SomeFoo a) = foo a
doThing (Other _) = False

good = doThing (SomeFoo (10 :: Int)) -- True
bad = doThing (SomeFoo (10 :: Double)) -- type error, no instance for Foo Double
Here, the SomeFoo constructor of the GADT type requires that the argument has a Foo instance, and in return allows us to use foo when pattern matching on it (in doThing). GHC implements this by storing a reference to the type class dictionary (https://dl.acm.org/doi/10.1145/75277.75283) in the constructor, in other words via dynamic dispatch.
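To make the dictionary idea concrete, here is a rough sketch of it transcribed by hand into Rust. This is only an illustration of the mechanism, not what GHC literally generates; FooDict, Gadt, do_thing and foo_usize are hypothetical names made up for this sketch.

```rust
// Hypothetical explicit "dictionary" for a Foo-like class:
// one function pointer per class method.
struct FooDict<T> {
    foo: fn(&T) -> bool,
}

// The constructor that carries the constraint also carries the dictionary.
enum Gadt<T> {
    SomeFoo(FooDict<T>, T),
    Other(T),
}

// No bound on `T` is needed: matching on `SomeFoo` hands us the dictionary,
// and the method is called through the stored function pointer.
fn do_thing<T>(g: &Gadt<T>) -> bool {
    match g {
        Gadt::SomeFoo(dict, a) => (dict.foo)(a),
        Gadt::Other(_) => false,
    }
}

// Plays the role of `instance Foo Int` in the Haskell example.
fn foo_usize(_: &usize) -> bool {
    true
}
```

A value is then built as Gadt::SomeFoo(FooDict { foo: foo_usize }, 10usize). The call goes through a pointer stored in the value, which is exactly the dynamic dispatch the rest of this post tries to avoid.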
Rust supports a form of dynamic dispatch via the dyn Trait construct (https://doc.rust-lang.org/std/keyword.dyn.html), but it's severely limited (requires the trait to be "object safe"), so it's a non-solution if you're looking for reasonable parity.
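For comparison, a dyn-based encoding might look like the sketch below. DynGadt and do_thing_dyn are hypothetical names, and the usize instance is only there to have something to call. It works for this particular Foo because the trait happens to be object safe.

```rust
trait Foo {
    fn foo(&self) -> bool;
}

impl Foo for usize {
    fn foo(&self) -> bool {
        true
    }
}

// Hypothetical dyn-based encoding: the constructor stores a trait object,
// so the vtable plays the role of the type class dictionary.
// Note that the payload is no longer tied to `T`.
enum DynGadt<T> {
    SomeFoo(Box<dyn Foo>),
    Other(T),
}

fn do_thing_dyn<T>(a: &DynGadt<T>) -> bool {
    match a {
        DynGadt::SomeFoo(x) => x.foo(),
        DynGadt::Other(_) => false,
    }
}

// A trait with a method such as `fn dup(&self) -> Self` is not object safe,
// so `Box<dyn ThatTrait>` would be rejected and this encoding would not apply.
```

Besides the object safety restriction, the SomeFoo payload here has lost its connection to the type parameter, which is a large part of why this does not reach parity with the Haskell version.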
Rust version
The definitions will look like this:
trait Foo {
    fn foo(&self) -> bool;
}

enum AlmostGADT<T> {
    SomeFoo(CanFoo<T>, T),
    Other(T),
}
The main thing to note is the additional CanFoo<T> field in the SomeFoo constructor. It's a simple newtype wrapper:
use std::marker::PhantomData;

struct CanFoo<T> {
    _phantom: PhantomData<T>,
}
Now we turn our attention to the two requirements in order.
Construct
Enforcing trait constraints for type parameters only for selected enum constructors
We can simply define a smart constructor and leave the CanFoo struct's field private, so the only way to construct it outside of the module is by instantiating with a type that implements Foo:
struct CanFoo<T: ?Sized> {
    _phantom: PhantomData<T>,
}
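The smart constructor itself could look roughly like this. The name can_foo matches the call in the test further down, but its exact signature is an assumption, and the witness module is a hypothetical wrapper added only to make the privacy boundary visible.

```rust
mod witness {
    use std::marker::PhantomData;

    pub trait Foo {
        fn foo(&self) -> bool;
    }

    // The field is private, so code outside this module cannot build a
    // `CanFoo<T>` with a struct literal.
    pub struct CanFoo<T: ?Sized> {
        _phantom: PhantomData<T>,
    }

    // Assumed shape of the smart constructor: the `T: Foo` bound is the only
    // gate, so a `CanFoo<T>` value is evidence that `T` implements `Foo`.
    pub fn can_foo<T: Foo + ?Sized>() -> CanFoo<T> {
        CanFoo {
            _phantom: PhantomData,
        }
    }
}

use witness::{can_foo, CanFoo, Foo};

impl Foo for usize {
    fn foo(&self) -> bool {
        true
    }
}

// `can_foo::<usize>()` compiles; `can_foo::<f64>()` would be rejected,
// because `f64` has no `Foo` implementation here.
```

The witness is zero-sized, so carrying it inside SomeFoo costs nothing at runtime.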
Match
Learning constraints upon destructing an enum and learning its constructor
So far this is not sufficient:
fn do_thing_1<T>(a: &AlmostGADT<T>) -> bool {
    match a {
        AlmostGADT::SomeFoo(_witness, a) => a.foo(), // error: no method `foo` for `&T`
        AlmostGADT::Other(_) => false,
    }
}
The key issue is that, unlike GHC, Rust does not support local assumptions under pattern matches, so it's not possible to introduce a trait bound into a local scope within a match arm. When resolving a trait bound, the compiler will always look at the current function's context, then the global context.
Since it's not possible to introduce local assumptions, we'll have to make do with global definitions.
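To see why the function's own context is the only place the bound can come from, consider this contrasting sketch (do_thing_bounded is a hypothetical name). Putting T: Foo on the signature makes the match arm compile, but at the cost of demanding Foo for every constructor, which is exactly what we wanted to avoid.

```rust
use std::marker::PhantomData;

trait Foo {
    fn foo(&self) -> bool;
}

struct CanFoo<T: ?Sized> {
    _phantom: PhantomData<T>,
}

enum AlmostGADT<T> {
    SomeFoo(CanFoo<T>, T),
    Other(T),
}

// Compiles because `T: Foo` is part of the function's own context.
// The witness contributes nothing here: the bound comes from the signature.
fn do_thing_bounded<T: Foo>(a: &AlmostGADT<T>) -> bool {
    match a {
        AlmostGADT::SomeFoo(_witness, a) => a.foo(),
        AlmostGADT::Other(_) => false,
    }
}

impl Foo for usize {
    fn foo(&self) -> bool {
        true
    }
}

// `do_thing_bounded(&AlmostGADT::Other(1.5f64))` would not compile,
// even though `Other` never needs `Foo`.
```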
The trick is to create a wrapper trait which I'll call MaybeFoo:
trait MaybeFoo {
    type Implemented;
    fn maybe_foo(&self, can_foo: &CanFoo<Self>) -> bool;
}
This is identical to Foo, except that it now has an associated type called Implemented. This will be either True or False:
struct True;
struct False;
We first define a catch-all implementation for all types:
impl<T> MaybeFoo for T {
    default type Implemented = False;
    default fn maybe_foo(&self, _can_foo: &CanFoo<T>) -> bool {
        unreachable!()
    }
}
Notice the default keyword on the two members. This requires #![feature(specialization)] (https://rustc-dev-guide.rust-lang.org/traits/specialization.html#specialization), and indeed makes fundamental use of specialisation semantics, which I'll come back to later. The catch-all maybe_foo won't be callable externally because, again, CanFoo<T> can only be created for types that implement Foo.
We provide another implementation for all types that implement Foo:
impl<T: Foo> MaybeFoo for T {
    type Implemented = True;
    fn maybe_foo(&self, _can_foo: &CanFoo<T>) -> bool {
        T::foo(self)
    }
}
(Coming from Haskell, this might look like a duplicate instance, but Rust is actually much more liberal with duplicate instance heads: it will happily consider the trait bounds when building the specialisation graph, so the specificity checker is not as syntactic.)
Now we can rewrite do_thing:
fn do_thing_2<T>(a: &AlmostGADT<T>) -> bool {
    match a {
        AlmostGADT::SomeFoo(witness, a) => a.maybe_foo(witness),
        AlmostGADT::Other(_) => false,
    }
}

impl Foo for usize {
    fn foo(&self) -> bool {
        true
    }
}

#[test]
fn test_foo() {
    let arg: usize = 10;
    assert_eq!(do_thing_2(&AlmostGADT::SomeFoo(can_foo(), arg)), true); // OK
}
Why does this work?
Coming from Haskell, do_thing_2 doing the right thing (instead of panicking) might be surprising. That's because in GHC, type class resolution is coupled with evidence generation, so when the compiler decides whether a type is an instance of a type class, it also finds the exact implementation and inserts a reference to it. That algorithm would pick up the default catch-all instance in a.maybe_foo(witness) (since that's the only one that matches in a fully generic context), and insert the panicking implementation into the call site.
However, Rust specialisation works differently. In the first pass, the type checker simply decides whether the type satisfies the trait bounds, but this pass is proof-irrelevant, i.e. the actual derivation doesn't matter.
The concrete implementation is picked at the end, during specialisation, when all type variables have been monomorphised and the most specific implementation candidate is selected. So when the type checker sees do_thing_2, it accepts the definition based on the Implemented = False instance, but when the specialiser sees it again with T = usize, it will pick up the Implemented = True instance. This means that the Rust solution relies only on static dispatch. This technique would not work with existential types, but Rust doesn't support them anyway.