4

When I define a new type using GADT syntax, what exactly makes the difference between a "normal" algebraic data type and a generalized algebraic data type? I think it has to do with the type signatures of the defined data constructors, but I could never find an exact definition.

Furthermore, what are the consequences of this difference that justify having to enable GADTs explicitly? I read that they make type inference undecidable, but why can't we treat the data constructors as functions with those type signatures and plug them into the inference algorithm?

duplode
pgmcr

2 Answers

7

Pattern-matching on a normal algebraic datatype (regardless of whether it's defined in GADT syntax) only injects value information into the body. Pattern-matching on a GADT can also inject type information.

For example,

data NonGADT a where
  Agnostic :: a -> NonGADT a

data GADT a where
  IntSpecific :: Int -> GADT Int
  CharSpecific :: Char -> GADT Char

All these constructors have, in a sense, the same signature a -> □ADT a, so a priori that's what the compiler would need to work with when typechecking a function that pattern-matches on them.

extractNonGADT :: NonGADT a -> a
extractNonGADT v = case v of
   Agnostic x -> x

extractGADT :: GADT a -> a
extractGADT v = case v of
   IntSpecific x -> x
   CharSpecific x -> x

The difference is that extractGADT can exploit the fact that, in each branch, the type a is constrained to a single concrete type, allowing things like

extractGADT' :: GADT a -> a
extractGADT' v = case v of
   IntSpecific x -> x + 5       -- here a ~ Int, so (+ 5) is allowed
   CharSpecific x -> toUpper x  -- here a ~ Char (toUpper is from Data.Char)

A standard Hindley-Milner compiler would not be able to handle this: it assumes that a stands for the same type throughout the whole definition. Technically speaking, a is the same everywhere in GHC too, except that the branches have the additional constraints a ~ Int and a ~ Char in scope, respectively. But it's not enough to just add handling for those constraints: this additional information must not be allowed to escape its scope (outside the branch it simply isn't true), and that requires an extra check.
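To make that concrete, here is a minimal, self-contained sketch (it repeats the GADT definition from above so it compiles on its own; the name `extract` is just a stand-in for extractGADT'). With the explicit signature, GHC uses the branch-local equality constraints and checks that they don't leak; if you delete the signature, GHC refuses to infer a type rather than guess one.

```haskell
{-# LANGUAGE GADTs #-}

import Data.Char (toUpper)

data GADT a where
  IntSpecific  :: Int  -> GADT Int
  CharSpecific :: Char -> GADT Char

-- The signature is essential: in each branch GHC brings a local
-- constraint (a ~ Int, resp. a ~ Char) into scope, typechecks the
-- right-hand side with it, and checks it doesn't escape the branch.
extract :: GADT a -> a
extract (IntSpecific x)  = x + 5       -- a ~ Int in scope here
extract (CharSpecific c) = toUpper c   -- a ~ Char in scope here
```

For instance, `extract (IntSpecific 1)` yields 6 and `extract (CharSpecific 'a')` yields 'A'.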

leftaroundabout
  • About the Hindley-Milner part: do I understand it correctly that HM would fail at unifying the types of `x + 5` and `toUpper x`? – pgmcr Aug 12 '23 at 19:55
  • Yes. The problem being to even try unifying them. – leftaroundabout Aug 12 '23 at 22:00
  • 1
    @pgmcr Note that with GADTs it is possible to write terms that have _multiple_ types, none of which is more general than the other. See [this question](https://stackoverflow.com/questions/72906927/multiple-types-for-f-in-this-picture) for more information. Therefore, we can't hope to have full type inference. – chi Aug 13 '23 at 08:28
4

The difference is in how polymorphic the return type is. If it is fully polymorphic, the GADTSyntax extension suffices, but any specialization requires the full GADTs extension. For example:

Foo :: Int -> Bar a b c

Because a, b, and c are all distinct type variables, this fits in the GADTSyntax extension. But if you were to write any of these, you'd need GADTs instead:

Foo :: Int -> Bar Int b c
Foo :: Int -> Bar a b b
Foo :: Eq a => Int -> Bar a b c
-- (actually, ExistentialQuantification is enough for this last one, but GADTs also works)

With fancy types like these, we lose the property that terms have principal types. You can still run standard type inference, but sometimes it would have to make an arbitrary choice, which could result in type errors even though everything is type-safe; to avoid that situation, GHC simply refuses to make those arbitrary choices and reports an error as early as possible instead.
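Here is a small sketch of the lost principal-types property (the type `T` and function `f` are hypothetical names, not from the answer). The single definition below admits both `T a -> a` and `T a -> Int` as valid signatures, and neither is more general than the other, so GHC demands an annotation instead of inferring one:

```haskell
{-# LANGUAGE GADTs #-}

data T a where
  TInt :: T Int

-- Both 'T a -> a' and 'T a -> Int' typecheck for this definition
-- (inside the match, a ~ Int), and neither is an instance of the
-- other: there is no principal type, so we must pick one explicitly.
f :: T a -> a   -- chosen arbitrarily; 'T a -> Int' would also work
f TInt = 0
```

This is the same phenomenon chi's comment points at: one GADT-involving term, multiple incomparable types.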

Daniel Wagner