1

Suppose you have a naive vector type, such as

data Vector a
  = V2 a a
  | V3 a a a

I am interested in how you can — and how you should — implement particular logic specific to the type's constructor. What I mean can be clarified with an example.

Suppose I want to implement:

instance Num a => Num (Vector a) where
  (+) = 
  (*) = _
  abs = _
  signum = _
  fromInteger = _
  negate = _

Due to pattern matching, I can very easily match V2 with V2. However, what if I want it to be invalid for a particular combination, such as between V2 and V3?

In this case, is it required to make individual types for V2and V3? Then, however, do I not lose the polymorphic properties they currently have, being the same type? For example, if I begin building operations consuming my Vector, I currently need only one function. However, if I split into two types, I need two distinct functions.

How can I solve this problem by thinking in types?

3 Answers3

3

So you want your ADT cases to look like distinct types sometimes, so they can't be accidentally mixed, yet other times you want to be able to treat them homogeneously? That's what GADTs are for (among other things).

Give your type a phantom type parameter, and force it to be different for the different constructors. When you don't care, you can just make that type parameter generic:

data Len2
data Len3

data Vector len a where
  V2 :: a -> a -> Vector Len2 a
  V3 :: a -> a -> a -> Vector Len3 a

vectorAdd :: Num a => Vector len a -> Vector len a -> Vector len a
vectorAdd (V2 x1 y1) (V2 x2 y2) = V2 (x1+x2) (y1+y2)
vectorAdd (V3 x1 y1 z1) (V3 x2 y2 z2) = V3 (x1+x2) (y1+y2) (z1+z2)

vectorFst :: Vector len a -> a
vectorFst (V2 x _) = x
vectorFst (V3 x _ _) = x

In vectorAdd the compiler recognizes that V2 and V3 can never be matched because they always have different len parameters, yet you still can write vectorFst that works for any len.

Next level - don't create special-purpose Len2 and Len3 types, use numeric type-level literals, which GHC supports:

data Vector len a where
  V2 :: a -> a -> Vector 2 a
  V3 :: a -> a -> a -> Vector 3 a
Fyodor Soikin
  • 78,590
  • 9
  • 125
  • 172
2

In this case, is it required to make individual types for V2 and V3? Then, however, do I not lose the polymorphic properties they currently have, being the same type? For example, if I begin building operations consuming my Vector, I currently need only one function. However, if I split into two types, I need two distinct functions.

You need two distinct functions -- but they don't have to have two distinct names! Make a class for all the base operations that make sense on vectors of either type; then you can build higher-level functions in terms of them if you like. For example:

data V2 a = V2 a a
data V3 a = V3 a a a

class Vector v where
    magnitude :: Floating a => v a -> a
    lerp :: Floating a => a -> v a -> v a -> v a

scalarLerp :: Floating a => a -> a -> a -> a
scalarLerp alpha a b = alpha*a + (1-alpha)*b

instance Vector V2 where
    magnitude (V2 x y) = sqrt (x^2 + y^2)
    lerp alpha (V2 x y) (V2 x' y') = V2
        (scalarLerp alpha x x')
        (scalarLerp alpha y y')

instance Vector V3 where
    magnitude (V3 x y z) = sqrt (x^2 + y^2 + z^2)
    lerp alpha (V3 x y z) (V3 x' y' z') = V3
        (scalarLerp alpha x x')
        (scalarLerp alpha y y')
        (scalarLerp alpha z z')

Then you could define, say,

midMagnitude :: (Vector v, Floating a) => v a -> v a -> a
midMagnitude v1 v2 = magnitude (lerp 0.5 v1 v2)

to compute the length of the average of two vectors, and it will enjoy that nice polymorphism property of working on either V2s or V3s even though it was only defined once.

(Actually, I would never put lerp in the Vector class; instead, I'd write class Applicative v => Vector v where ... and define lerp = liftA2 . scalarLerp. The development here was just for expository purposes.)

Daniel Wagner
  • 145,880
  • 9
  • 220
  • 380
  • I see! I like the use of Vector being a type class rather than a data type, in order to use polymorphism! Very interesting, thanks you very much :) – TimeTravelPenguin May 19 '22 at 02:18
1

(Because you probably don't want to make Vector a Num, I'll be referring to monad operations instead.)

Haskell's [] monad solves this problem by simply going through every possible combination of elements. For example, [(+1),(*2)] <*> [3,6,5] = [4,7,6, 6,12,10]. For your Vector implementation which only supports two or three elements, though, this won't work. Would V2 (+1) (*2) <*> V3 3 6 5 have two or three elements? For this reason, you probably want to expand your type to support all lengths of lists, which would just bring you right back to [].

If you want to maintain this "zipping" operation, though... things get more complicated. Possible, but complicated. The basic idea is to embed the length of the vector into the type while still maintaining polymorphism. We can essentially do this with type-level natural numbers...

data N = Z | S N

(Z stands for "zero" and S for "successor", e.g. S (S (S (S Z))) = 4)

...and we can utilize this with data kinds and GADTs:

data Vector n a where
    Nil :: Vector 'Z a
    (:^) :: a -> Vector n a -> Vector ('S n) a

Here, n is the length of the vector. For example, V2 would be represented by Vector (S (S Z)) a and V3 by Vector (S (S (S Z))) a. Notably, though, you can make the length polymorphic, e.g. Vector (S n) a is the type of all non-empty vectors (length is at least one). Unfortunately, (GHC's) type inference breaks down in the presence of GADTs, so you'll probably need scoped type variables (and a lot of explicit type signatures) to do anything useful.

All the actual data definition is saying is that Nil has length zero, and an element prepended to a vector (x :^ xs) has length one more than the vector given.

As an example, here's a Functor instance, defining fmap analogously to []'s map:

instance Functor (Vector n) where
    fmap :: forall a b. (a -> b) -> Vector n a -> Vector n b
    fmap f = go where
        go :: Vector k a -> Vector k b
        go Nil = Nil
        go (x:^xs) = f x :^ go xs

Notice all of the type signatures littered around. Unfortunately, this is almost always necessary because GHC isn't too keen to assume polymorphism in the presence of GADTs. The forall here helps scope a and b so that they can be used in the type for go. If it wasn't there, then the a and b in fmap and the a and b in go would be completely separate type variables, just with similar names.

To define more interesting functions, GHC needs to know that we want to interact with the length of the list, so we define a singleton type which "refines" the length...

data P n where
    PZ :: P 'Z
    PS :: P n -> P ('S n)

(read P n as "a proof of n's natural value)

...and a class which gives access to that type:

class Nat n where
    nat :: P n
instance Nat 'Z where
    nat = PZ
instance Nat n => Nat (S n) where
    nat = PS nat

From here, we can define replicateV analogously to []'s replicate:

replicateV :: forall n a. Nat n => a -> Vector n a
replicateV x = go nat where
    -- read as "go takes a proof of k's value and returns a vector of length k"
    go :: P k -> Vector k a
    go PZ = Nil
    go (PS p) = x :^ go p

From here, you can define a vzip :: (a -> b -> c) -> Vector n a -> Vector n b -> Vector n c similar to []'s zip.

(See also this question which expands on these ideas in a slightly different way.)

fluffyyboii
  • 635
  • 6
  • 15
  • 1
    “If it wasn't there, then the `a` and `b` in `fmap` and the `a` and `b` in `go` would be completely separate type variables, just with similar names.” Is this related to the error of rigid types? I vaguely recall having that exact issue in this circumstance. That approach with `foreach` is an interesting one, which I have not learnt before. Thank you very much for your information. It is very helpful knowledge! – TimeTravelPenguin May 18 '22 at 11:43