Efficiently abstracting over datatype arity

Question

As everyone knows, you can easily build n-tuples out of 2-tuples.

record Twople (A B : Set) : Set where
  constructor _,_
  field
    fst : A
    snd : B

n-ple : List Set -> Set
n-ple = foldr Twople Unit

(Agda syntax, but it'll work in Idris, and can be made to work in Haskell, Scala...)

Likewise, you can build n-ary sum types out of binary sums.

data Either (A B : Set) : Set where
  left : A -> Either A B
  right : B -> Either A B

OneOf : List Set -> Set
OneOf = foldr Either Void

Simple, and pretty, but inefficient. Plucking the rightmost item from an n-ple takes O(n) time because you have to unpack every nested Twople on the way. n-ple is more like a heterogeneous list than a tuple. Likewise, in the worst case, a OneOf value lives at the end of n-1 rights.

The inefficiency can be mitigated by manually unpacking the fields, and inlining the constructor cases, of your datatypes:

record Threeple (A B C : Set) : Set where
  constructor _,_,_
  field
    fst : A
    snd : B
    thd : C

data Threeither (A B C : Set) : Set where
  left : A -> Threeither A B C
  middle : B -> Threeither A B C
  right : C -> Threeither A B C

Picking an item from a Threeple and performing case-analysis on a Threeither are both now O(1). But there's a lot of typing involved, and not the good kind - Fourple, Nineither, Hundredple, and so on, must all be separately defined datatypes.

Must I choose between O(1) time and O(1) lines of code? Or can dependent types help me out? Can one efficiently abstract over datatype arity?

You can define all these data types using a function that computes their [descriptions](https://personal.cis.strath.ac.uk/conor.mcbride/levitation.pdf). Then a hypothetical type checker can turn described data types into their usual form. Note also that with dependent types it's `Threeple (A : Set) (B : A -> Set) (C : {x : A} -> B x -> Set)` rather than `Threeple (A B C : Set)`. — effectfully, May 21 '16 at 03:03
There's a pedagogy PhD to be written about the usual co-incidence of dependently typed programming and severe puns. — Cactus, May 22 '16 at 12:52

score 6 · Accepted Answer · answered May 21 '16 at 07:36

For O(1) field access with O(1) code, we need an array as a primitive object, or some heterogeneous or dependent generalization that is still implemented as an array. Agda doesn't have anything like this.

For n-ary sums, the situation is a bit more nuanced, but things also depend on machine implementation. Here, O(1) could reasonably mean that we're able to pattern match on an arbitrary constructor with one pointer dereference and one constructor tag check, just like with natively defined sum types. In Agda one could try:

open import Data.Nat
open import Data.Vec

record NSum {n}(ts : Vec Set n) : Set₁ where
  field
    ix  : Fin n
    dat : lookup ix ts

Of course, this depends on Data.Fin being implemented as machine (big)integers, which is AFAIK not the case in current Agda. We should try Data.Nat instead, since it's implemented efficiently:

open import Data.Nat hiding (_<_)
open import Agda.Builtin.Nat using (_<_)
open import Data.Bool
open import Data.List

lookup : ∀ {α}{A : Set α}(as : List A) n → T (n < length as) → A
lookup [] zero ()
lookup [] (suc n) ()
lookup (a ∷ as) zero    p = a
lookup (a ∷ as) (suc n) p = lookup as n p

record NSum (ts : List Set) : Set₁ where
  constructor nsum
  field
    ix       : ℕ
    {bound}  : T (ix < length ts)
    dat      : lookup ts ix bound

foo : NSum (ℕ ∷ Bool ∷ ℕ ∷ Bool ∷ [])
foo = nsum 0 10

bar : NSum (ℕ ∷ Bool ∷ ℕ ∷ Bool ∷ []) → ℕ
bar (nsum 2 dat) = dat
bar _            = 3

Note that we used Boolean _<_ instead of the default _<_, because the proof involving the former takes up O(1) space. Also, lookup only runs at compile time for most use cases, so it doesn't spoil our O(1).

You could also use an irrelevant proof of `ix < length ts` rather than a `Bool`ean one by using a dotted field. This way you get to have the structure you can easily perform induction on without the runtime cost! — gallais, May 21 '16 at 16:41
@gallais Dotted fields didn't work for me. I can't pattern match on them, so can't write `lookup`, and the builtin proof irrelevance postulate has been removed from the latest standard lib/Agda version. — András Kovács, May 21 '16 at 16:44
Something [like this](https://gist.github.com/gallais/6e57548a07dc28288a619ed686867a3c) works fine. The trick is to define the inversion functions as auxiliary lemma (one has to do the same sort of thing in Coq when inverting `Prop`s whilst building something in `Type`). — gallais, May 21 '16 at 17:00
OK, I don't understand dotted irrelevance. So we can match absurd patterns, but not normal patterns, but still pass irrelevant `pr` to `1+m≤1+n-inv` which expects a relevant parameter and even does pattern matching. — András Kovács, May 21 '16 at 17:18
You can feed an irrelevant argument to a function but it will produce an irrelevant output. Here the recursive call expects an irrelevant argument so it's all fine. ;) — gallais, May 21 '16 at 17:19

Efficiently abstracting over datatype arity

1 Answers1