How close can I get to GADTs in Rust?

Question

As of 2023, GADTs are not officially supported in Rust. However, I wonder how much of their power I can get using other features, such as traits and associated types.

To throw in some context and make the question less vague, here are some examples of what I consider "getting closer":

Conditional invalidation of enum constructors
Enforcing trait constraints for type parameters only for selected enum constructors
Learning constraints upon destructing an enum and learning its constructor

By "conditional invalidation" I mean that

The typechecker will rule out a particular constructor when some condition (however it is defined) is met
There will be no complaints when I skip such a constructor in pattern matching

So for example

enum AlmostGADT<T> 
where /* magic? */
{
    Flexi,
    NotSoFlexi( /* something here maybe? */) : AlmostGADT< /* or here? */ >
} 

fn main() {
  let x: AlmostGADT<condition_true!()> = AlmostGADT::NotSoFlexi(...);  // Works!

  match x { AlmostGADT::Flexi => todo!() }                             // Patterns not exhaustive

  let y: AlmostGADT<condition_false!()> = AlmostGADT::NotSoFlexi(...); // Type error
  let y: AlmostGADT<condition_false!()> = AlmostGADT::Flexi;           // Works!

  match y {
    AlmostGADT::Flexi => todo!(),                                      // Works!
    AlmostGADT::NotSoFlexi(_) => todo!()                               // Type error
  }

  match y { AlmostGADT::Flexi => todo!() }                             // Works!
  
}

The closest I can get is to carry a member with an impossible value, for example

enum AlmostGADT<T> {
    Flexi,
    NotSoFlexi(T)
}

enum Void {}

fn main() {
  let full: AlmostGADT<()> = todo!();
  let partial: AlmostGADT<Void> = todo!();
}

Here it is impossible to create a value of type Void, so NotSoFlexi(_) of type AlmostGADT<Void> cannot be constructed either. Unfortunately, the typechecker does not take that into account, because it is still possible to write an expression of that type (e.g. todo!()). It therefore demands that the impossible NotSoFlexi case be handled in pattern matching, which means polluting the code with a dodgy and dirty unreachable!().
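For reference, a minimal sketch of this setup on stable Rust. The helper names describe_partial and describe_full are hypothetical and only serve the illustration. The impossible arm is still written out, but an empty match on the Void payload can stand in for unreachable!(), since a type with no variants needs no arms:

```rust
// Uninhabited type: it has no constructors, so no value of it can exist.
enum Void {}

enum AlmostGADT<T> {
    Flexi,
    NotSoFlexi(T),
}

// Hypothetical helper: handles the "partial" instantiation.
fn describe_partial(x: AlmostGADT<Void>) -> &'static str {
    match x {
        AlmostGADT::Flexi => "flexi",
        // `v: Void` has no variants, so an empty match type-checks and
        // yields a value of any type; no panic is involved.
        AlmostGADT::NotSoFlexi(v) => match v {},
    }
}

// Hypothetical helper: handles the "full" instantiation.
fn describe_full(x: AlmostGADT<()>) -> &'static str {
    match x {
        AlmostGADT::Flexi => "flexi",
        AlmostGADT::NotSoFlexi(()) => "not so flexi",
    }
}
```

This does not remove the arm, it only replaces the runtime panic with something the compiler checks.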

How would you approach it in different setups?

I know the question is broad, but that's due to my limited knowledge of Rust's type system. If you think it's too broad, please consider my AlmostGADT example, just don't feel limited to it.

I don't have any particular expectations on how exactly this would be tackled. I don't mind using weird trait patterns, nightly/experimental features, lifetime hacks, etc. I just wonder how much I can stretch the system to achieve something comparable to GADTs.

On nightly you can do [something like this](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=d3704342a2b319d3597e0c054fd48373) using the unstable `never_type` and `exhaustive_patterns` features plus making some lints compile errors. But I don't think this can scale particularly well. Is this out of academic interest? — isaactfa, Jun 24 '23 at 11:05
Maybe relevant from the last TWIR: [Encoding ML-style modules in Rust](https://blog.waleedkhan.name/encoding-ml-style-modules-in-rust/). — isaactfa, Jun 24 '23 at 11:09
It might be worth noting that in languages that implement GADTs, you still sometimes have to help the compiler realize that a certain pattern matching is, indeed, exhaustive. For instance, in OCaml, you can add a branch `_ -> .`. Also, I read some time ago about a WIP feature to avoid having to handle unreachable patterns. Also, if you pattern match and get a value `x: !`, then you should not call `unreachable!()` (which is just a panic with a fancy name) but simply return `x`. See [this](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=2093d2707a5d78b4c55e0d3429822168). — jthulhu, Jun 24 '23 at 11:31

Csongor Kiss · Answer 1 · 2023-07-25T11:49:30.303

I will focus on these two:

Enforcing trait constraints for type parameters only for selected enum constructors

Learning constraints upon destructing an enum and learning its constructor

The first one is much simpler than the second, but not very useful on its own, so we'll try to encode both. Let's start with a simple Haskell definition that we'll then implement in Rust:

{-# LANGUAGE GADTs #-}

class Foo a where
  foo :: a -> Bool

data GADT a where
  SomeFoo :: Foo a => a -> GADT a
  Other :: a -> GADT a


-- example usage
instance Foo Int where
  foo _ = True

doThing :: GADT a -> Bool
doThing (SomeFoo a) = foo a
doThing (Other _) = False

good = doThing (SomeFoo (10 :: Int)) -- True
bad = doThing (SomeFoo (10 :: Double)) -- type error, no instance for Foo Double

Here, the SomeFoo constructor of the GADT type requires that the argument has a Foo instance, and in return allows us to use foo when pattern matching on it (in doThing). The Haskell runtime implements this by storing a reference to the type class dictionary (https://dl.acm.org/doi/10.1145/75277.75283) in the constructor, in other words via dynamic dispatch.

Rust supports a form of dynamic dispatch via the dyn Trait construct (https://doc.rust-lang.org/std/keyword.dyn.html), but it's severely limited (requires the trait to be "object safe"), so it's a non-solution if you're looking for reasonable parity.
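For comparison, the dictionary idea can also be written out by hand on stable Rust, by storing a function pointer in the constructor instead of going through dyn Trait. This is a different technique from the one developed in the rest of this answer, and the names FooDict, DictGADT and do_thing_dict are hypothetical. A minimal sketch, assuming a trait with a single method:

```rust
trait Foo {
    fn foo(&self) -> bool;
}

// Hand-written "type class dictionary": one function pointer per trait method.
struct FooDict<T> {
    foo: fn(&T) -> bool,
}

impl<T: Foo> FooDict<T> {
    // The dictionary can only be built for types that implement `Foo`.
    fn new() -> Self {
        FooDict { foo: T::foo }
    }
}

enum DictGADT<T> {
    SomeFoo(FooDict<T>, T),
    Other(T),
}

// No `T: Foo` bound here: the evidence travels inside the constructor.
fn do_thing_dict<T>(g: &DictGADT<T>) -> bool {
    match g {
        DictGADT::SomeFoo(dict, a) => (dict.foo)(a),
        DictGADT::Other(_) => false,
    }
}

impl Foo for i32 {
    fn foo(&self) -> bool {
        true
    }
}
```

The cost is an indirect call and one pointer of storage per method, and the dictionary has to be kept in sync with the trait by hand.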

Rust version

The definitions will look like this:

trait Foo {
    fn foo(&self) -> bool;
}

enum AlmostGADT<T> {
    SomeFoo(CanFoo<T>, T),
    Other(T),
}

The main thing to note is the additional CanFoo<T> field in the SomeFoo constructor. It's a zero-sized marker struct that holds nothing but a PhantomData:

use std::marker::PhantomData;

struct CanFoo<T> {
    _phantom: PhantomData<T>,
}

Now we turn our attention to the two requirements in order.

Construct

Enforcing trait constraints for type parameters only for selected enum constructors

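We can define a smart constructor and leave the CanFoo struct's field private, so the only way to construct a CanFoo<T> outside of the module is to instantiate it with a type that implements Foo. The T: ?Sized bound is there so that CanFoo<Self> can later appear in a trait method signature, where Self is not assumed to be Sized:

struct CanFoo<T: ?Sized> {
    _phantom: PhantomData<T>,
}

fn can_foo<T: Foo>() -> CanFoo<T> {
    CanFoo { _phantom: PhantomData }
}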
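The privacy argument only holds if the witness lives in its own module. A minimal sketch of that layout, assuming a module named witness and a helper witness_for_usize (both hypothetical):

```rust
mod witness {
    use std::marker::PhantomData;

    pub trait Foo {
        fn foo(&self) -> bool;
    }

    // The field is private, so code outside `witness` cannot write
    // `CanFoo { _phantom: PhantomData }` directly.
    pub struct CanFoo<T: ?Sized> {
        _phantom: PhantomData<T>,
    }

    // The only public way to obtain a witness; it demands `T: Foo`.
    pub fn can_foo<T: Foo>() -> CanFoo<T> {
        CanFoo { _phantom: PhantomData }
    }
}

use witness::{can_foo, CanFoo, Foo};

impl Foo for usize {
    fn foo(&self) -> bool {
        true
    }
}

// Hypothetical helper: `usize` implements `Foo`, so a witness can be built.
fn witness_for_usize() -> CanFoo<usize> {
    can_foo()
}

// By contrast, `can_foo::<f64>()` is rejected at compile time,
// because `f64` does not implement `Foo`.
```

Since CanFoo only holds a PhantomData, the witness takes up no space in the enum.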

Match

Learning constraints upon destructing an enum and learning its constructor

So far this is not sufficient:

fn do_thing_1<T>(a: &AlmostGADT<T>) -> bool {
    match a {
        AlmostGADT::SomeFoo(_witness, a) => a.foo(), // error: no method `foo` for `&T`
        AlmostGADT::Other(_) => false,
    }
}

The key issue is that, unlike GHC, Rust does not support local assumptions under pattern matches, so it's not possible to introduce a trait bound into a local scope within a match arm. When resolving a trait bound, the compiler will always look at the current function's context, then the global context.
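To illustrate what "the current function's context" means, here is a hedged sketch of the version that does compile on stable Rust: the bound is moved onto the function itself. The enum is simplified (no witness field) and the names Simple and do_thing_bounded are hypothetical:

```rust
trait Foo {
    fn foo(&self) -> bool;
}

// Simplified stand-in for the enum, without the witness field.
enum Simple<T> {
    SomeFoo(T),
    Other(T),
}

// The bound sits on the whole function, so it is in scope in every arm.
fn do_thing_bounded<T: Foo>(a: &Simple<T>) -> bool {
    match a {
        Simple::SomeFoo(a) => a.foo(),
        Simple::Other(_) => false,
    }
}

impl Foo for usize {
    fn foo(&self) -> bool {
        true
    }
}

// `do_thing_bounded(&Simple::Other(1.5f64))` does not compile:
// the bound is demanded even though `Other` never calls `foo`.
```

This is exactly what the GADT version avoids: the constraint now applies to every constructor, not only to SomeFoo.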

Since it's not possible to introduce local assumptions, we'll have to make do with global definitions.

The trick is to create a wrapper trait which I'll call MaybeFoo:

trait MaybeFoo {
    type Implemented;

    fn maybe_foo(&self, can_foo: &CanFoo<Self>) -> bool;
}

This mirrors Foo, except that its method takes a CanFoo witness and it has an associated type called Implemented. This will be either True or False:

struct True;
struct False;

We first define a catch-all implementation for all types

impl<T> MaybeFoo for T {
    default type Implemented = False;

    default fn maybe_foo(&self, _can_foo: &CanFoo<T>) -> bool {
        unreachable!()
    }
}

Notice the default keyword on the two members. This requires #![feature(specialization)] (https://rustc-dev-guide.rust-lang.org/traits/specialization.html#specialization), and indeed makes fundamental use of specialisation semantics, which I'll come back to later. This function won't be callable externally because, again, CanFoo<T> can only be created for types that implement Foo.

We provide another implementation for all types that implement Foo:

impl<T: Foo> MaybeFoo for T {
    type Implemented = True;

    fn maybe_foo(&self, _can_foo: &CanFoo<T>) -> bool {
        T::foo(self)
    }
}

(Coming from Haskell, this might look like a duplicate instance, but Rust is much more liberal with duplicate instance heads: it will happily consider the trait bounds when building the specialisation graph, so the specificity checker is not as syntactic.)

Now we can rewrite do_thing:

fn do_thing_2<T>(a: &AlmostGADT<T>) -> bool {
    match a {
        AlmostGADT::SomeFoo(witness, a) => a.maybe_foo(witness),
        AlmostGADT::Other(_) => false,
    }
}

impl Foo for usize {
    fn foo(&self) -> bool {
        true
    }
}

#[test]
fn test_foo() {
    let arg: usize = 10;
    assert_eq!(do_thing_2(&AlmostGADT::SomeFoo(can_foo(), arg)), true); // OK
}

Why does this work?

Coming from Haskell, do_thing_2 doing the right thing (instead of panicking) might be surprising. That's because in GHC, type class resolution is coupled with evidence generation, so when the compiler decides whether a type is an instance of a type class, it also finds the exact implementation and inserts a reference to it. That algorithm would pick up the default catch-all instance in a.maybe_foo(witness) (since that's the only one that matches in a fully generic context), and insert the panicking implementation into the call site.

However, Rust specialisation works differently. In the first pass, the type checker simply decides whether the type satisfies the trait bounds, but this pass is proof-irrelevant, i.e. the actual derivation doesn't matter. The concrete implementation is picked out at the end during specialisation, where all type variables are monomorphised and the most specific implementation candidate is selected. So when the typechecker sees do_thing_2, it accepts the definition based on the Implemented = False instance, but when the specialiser sees it again with T = usize, it picks up the Implemented = True instance. This means that the Rust solution relies only on static dispatch. This technique would not work with existential types, but Rust doesn't support them anyway.
