4

Most examples for Haskell generics do small bits of computation recursively around the :+: and :*: types/constructors. I seem to be solving a problem where this may not work out.

I'm trying to write a generic validation function that takes any two records having the same shape and validates each field in recordA against a validation function defined in recordB to return an error record of the same shape OR recordA itself.

Example:

-- Some type synonyms for better readability
type Name = Text
type Age = Int
type Email = Text
type GeneralError = Text
type FieldError = Text

-- a polymorphic record to help preserve the shape of various records
data User n a e = User {name :: n, age :: a, email :: e}

-- the incoming value which has been parsed into the correct type
-- but still needs various values to be validated, eg length, format, etc
type UserInput = User Name Age Email

-- specifies the exact errors for each field
type UserError = User [FieldError] [FieldError] [FieldError]

-- specifies how to validate each field. the validator is being passed
-- the complete record along with the specific field to allow
-- validations that depends on the value of another field
type UserValidator = User
                     (UserInput -> Name -> Either ([GeneralError], [FieldError]) Name)
                     (UserInput -> Age -> Either ([GeneralError], [FieldError]) Age)
                     (UserInput -> Email -> Either ([GeneralError], [FieldError]) Email)

let (validationResult :: Either ([GeneralError], UserError) UserInput)
  = genericValidation (i :: UserInput) (v :: UserValidator)

Now, the reason why doing this recursively around :*: might not work is, that one needs to look at the result of every validation function and then decide if the return value should be a Left ([GeneralError], UserError) or a Right UserInput. We cannot evaluate to a Left value on the first validation function that fails.

Is there any way to write this genericValidation function using Haskell generics?

Saurabh Nanda
  • 6,373
  • 5
  • 31
  • 60
  • I don't understand why this isn't supposed to be doable recursively. Just return _both_ possible results while you're unsure which one will be needed. — That said, I don't think “any two records having the same shape” is something you can ever have quite reliably – it's undefined how a given record is translated to `:*:`-nestings. Though I don't see why the compiler would choose to give same-structure records differently-structured representations. – leftaroundabout Aug 04 '17 at 17:35
  • Do the errors for each field actually need to come in different slots of a datatype, or would something like a `Map FieldName FieldError` be acceptable? Also, why does the validation function for each field need access to the whole `UserInput`? – danidiaz Aug 04 '17 at 18:56
  • @danidiaz using a `Map` to represent the errors would be my last resort. It would mean a considerable loss in type safety and not significantly different from the current state-of-the-art (digestive functors). The validation function needs access to the whole User input to allow cases where the validation logic on one field depends on the value of another field. Eg. if participant type is "child", then age must be less than 10. – Saurabh Nanda Aug 05 '17 at 02:28
  • @leftroundabout even if both possible results are returned, I'm unable to wrap my head around the generic machinery which will pick between one of the two sets of results at the correct level of nesting. If something is expecting records of same shape, and differently shaped records are given to it (either due to programmer error, or compiler quirk), won't it result in a compile time error (a horribly complicated one, I presume, but a compile error nevertheless) – Saurabh Nanda Aug 05 '17 at 02:33

2 Answers2

4

Now, the reason why doing this recursively around :*: might not work is, that one needs to look at the result of every validation function and then decide if the return value should be a Left ([GeneralError], UserError) or a Right UserInput. We cannot evaluate to a Left value on the first validation function that fails.

The standard Applicative behaviour for Either is not the only reasonable behaviour for that type! As you said, when you're, eg, validating a form, you want to return a collection of all the errors that occurred, not just the first one. So here's a type that's structurally the same as Either but has a different Applicative instance.

newtype Validation e a = Validation (Either e a) deriving Functor

instance Semigroup e => Applicative (Validation e) where
    pure = Validation . pure
    Validation (Right f) <*> Validation (Right x) = Validation (Right $ f x)
    Validation (Left e1) <*> Validation (Left e2) = Validation (Left $ e1 <> e2)
    Validation (Left e) <*> _ = Validation (Left e)
    _ <*> Validation (Left e) = Validation (Left e)

When both computations failed, the composed computation also fails, returning the two errors composed using their Semigroup instance - both the errors, for some suitable notion of both. If both computations succeed, or only one of them fails, then Validation behaves like Either. So it's kind of like a Frankensteinian mishmash of the Either and Writer applicatives.

This instance does satisfy the Applicative laws, but I'll leave the proof to you. Oh, and Validation can't be made into a lawful Monad.


Forgive me for taking the liberty of rearranging your types a bit. I'm using a common trick for reusing the structure of a record at a variety of different types: parameterise the record by a type constructor. You recover the original record by applying the template to the Identity functor.

data UserTemplate f = UserTemplate {
    name :: f Name,
    age :: f Age,
    email :: f Email
}
type User = UserTemplate Identity

A useful newtype: a Validator is a function which takes an a and returns either the a or a monoidal summary of the errors.

newtype Validator e a = Validator { runValidator :: a -> Validation e a }

A useful class: HTraversable is like Traversable but for functors from the category of type constructors to Hask. (More on this in a previous question of mine.)

class HFunctor t where
    hmap :: (forall x. f x -> g x) -> t f -> t g
class HFunctor t => HTraversable t where
    htraverse :: Applicative a => (forall x. f x -> Compose a g x) -> t f -> a (t g)
    htraverse f = hsequence . hmap f
    hsequence :: Applicative a => t (Compose a g) -> a (t g)
    hsequence = htraverse id

Why's HTraversable relevant? TraversableClassic™ allows you to sequence Applicative effects like Validation over homogeneous containers like lists. But a record is rather more like a heterogeneous container: a record "contains" a bunch of fields, but each field has its own type. HTraversable is precisely the class for when you need to sequence Applicative actions over polymorphic containers.

Another useful class generalises zipWith to these heterogeneous containers.

class HZip t where
    hzip :: (forall x. f x -> g x -> h x) -> t f -> t g -> t h

Records constructed in the fashion of UserTemplate are traversable and zippable. (In fact they're typically HRepresentable - an analogous higher-order notion of Representable - which is a very useful property, though I won't dwell on it here.)

instance HFunctor UserTemplate where
    hmap f (UserTemplate n a e) = UserTemplate (f n) (f a) (f e)

instance HTraversable UserTemplate where
    htraverse f (UserTemplate n a e) = UserTemplate <$>
        getCompose (f n) <*>
        getCompose (f a) <*>
        getCompose (f e)

instance HZip UserTemplate where
    hzip f (UserTemplate n1 a1 e1) (UserTemplate n2 a2 e2) = UserTemplate (f n1 n2) (f a1 a2) (f e1 e2)

Hopefully it should be quite easy to see what a Generic or Template Haskell implementation of HTraversable and HZip for an arbitrary record fitting this pattern would do.

So, the plan is: write Validators for each field and then hzip these Validators along the object you want to validate. Then you can htraverse the result to get a Validation containing the validated object. This pattern works for field-by-field validation, per your question. If you need to look at multiple fields to validate your record, you can't use hzip (but of course you also can't use Generic).

type Validatable t = (HZip t, HTraversable t)
validate :: (Semigroup e, Validatable t) => t (Validator e) -> Validator e (t Identity)
validate t = Validator $ htraverse (Compose . fmap Identity) . hzip val t
    where val v = runValidator v . runIdentity

A particular validator for a type such as User basically involves picking a monoidal error and returning a record of validation functions. Here I'm defining a Monoid for UserError which lifts a monoidal e point-wise through each field of the record.

type UserError e = UserTemplate (Const e)

instance Semigroup e => Semigroup (UserError e) where
   x <> y = hzip (<>) x y

Now you can just define a record of validator functions.

type UserValidator = Validator ([GeneralError], UserError [FieldError])

validateEmail :: UserInput -> UserValidator Email
validateEmail i = Validator v
    where v e
            | '@' `elem` toString e = pure e
            | otherwise = Validation $ Left ([], UserTemplate [] [] [FieldError "missing @"])

validateName :: UserInput -> UserValidator Name
validateName = ...
validateAge :: UserInput -> UserValidator Age
validateAge = ...

userValidator :: UserInput -> UserValidator User
userValidator input = validate $ UserTemplate {
    name = validateName input,
    age = validateAge input,
    email = validateEmail input
}

You can make it easier to compose smaller validators - so that each validator doesn't need to know about the whole error structure - using lenses.

Benjamin Hodgson
  • 42,952
  • 15
  • 108
  • 157
  • "If you need to look at multiple fields to validate your record, you can't use hzip (but of course you also can't use Generic)." - - why do you say that, especially for generics? – Saurabh Nanda Aug 05 '17 at 02:19
  • 1
    @SaurabhNanda Sorry, I should've been clearer. Your question was about validating each field one at a time in a compositional (context-free) manner. If you need context-sensitive validation, where the behaviour of a validator is conditional on the values of some other field, `hzip` can't help you because it works on a field-by-field basis. An example of this would be something like "if the user is under 18, the credit card field must be empty", or "both of these passwords must be equal". Such a validator has to look at the whole record. Hope that answers your question – Benjamin Hodgson Aug 05 '17 at 08:34
2

This answer tries to comply with the requirement that field-specific errors should be stored in an appropriate slot. I don't tackle "general errors" because they are simpler to implement and this answer is sufficiently complicated as it is.

Instead of using polymorphic records, we will use regular records, augmented with the generics-sop library. This library lets you define and work with generic representations of records, in which each record field is wrapped in some type constructor. The generic representations are basically n-ary products parameterized by a type-level list of field types. Notice that the fields don't have names; if we want to manipulate the n-ary products directly, we need to work positionally.

import Data.Bifunctor (bimap)
import qualified GHC.Generics as GHC
import Generics.SOP
import Control.Applicative.Lift (Errors,runErrors,failure)

data User = User { name :: Name, age :: Age, email :: Email } deriving (Show,GHC.Generic)

instance Generic User -- this generic is from generics-sop

A field validation type that depends on Errors from transformers. Notice that it also receives the whole record r:

newtype Validator r a = 
    Validator { runValidator :: r -> a -> Errors [FieldError] a } 

Usher wraps a function that injects FieldErrors in the correct slot of an N-ary error record:

newtype Usher res xs a = Usher { getUsher :: res -> NP (K res) xs }

ushers returns an n-ary product with the appropriate Usher injectors for each field. Notice the Monoid constraint; without it we wouldn't be able to inject empty values in other fields.

ushers :: forall r xs res. (IsProductType r xs, Monoid res)
       => Proxy r -> NP (Usher res xs) xs
ushers _ =
    let expand (Fn injection) =
            Usher $ \res -> hexpand (K mempty) (unK (injection (K res)))
    in hliftA expand (injections @xs @(K res))

Another helper function not provided by generics-sop:

-- combine the individual fields of a list of uniform n-ary-products 
fold_NP :: forall w xs . (Monoid w, SListI xs) => [NP (K w) xs] -> NP (K w) xs
fold_NP = Prelude.foldr (hliftA2 (mapKKK mappend)) (hpure (K mempty))

The actual validation function. Notice that the list of validators is provided as an n-ary product (derived from the record r):

validate :: forall r xs . IsProductType r xs
         => NP (Validator r) xs -> r -> Either (NP (K [FieldError]) xs) r
validate validators r =
    let validators' = validators    :: NP (Validator r) xs
        rs = hpure (K r)            :: NP (K r) xs -- a copy of the record in each slot
        np = unZ (unSOP (from r))   :: NP I xs -- generic representation of the record
        validated                   :: NP (Errors [FieldError])   xs
        validated = hliftA3 (\(Validator v) (K rec) (I a) -> v rec a) validators' rs np

        ushers' = ushers (Proxy @r) :: NP (Usher [FieldError] xs) xs -- error injectors
        injected                    :: NP (Errors [NP (K [FieldError]) xs]) xs
        injected = hliftA2 (\(Usher usher) errors ->
                                case runErrors errors of
                                    Right a' -> pure a'
                                    Left es -> failure [usher es])
                           ushers'
                           validated
    in bimap fold_NP (to . SOP . Z) . runErrors . hsequence $ injected

Finally, an example:

main :: IO ()
main = do
   let valfail msg = Validator (\_ _ -> failure [msg])
       validators = valfail "err1" :* valfail "err2" :* valfail "err3" :* Nil 
   print $ validate validators (User "Foo" 40 "boo@bar")
   -- returns Left (K ["err1"] :* (K ["err2"] :* (K ["err3"] :* Nil)))
danidiaz
  • 26,936
  • 4
  • 45
  • 95
  • Here's a gist with the complete code https://gist.github.com/danidiaz/65254521f8db91307eab2cd4d6a55f0c the answer here omits some required language extensions. – danidiaz Aug 05 '17 at 14:44
  • Brilliant! I have a feeling that my answer lies with generics - sop and the way you've approached it. Only if I can wrap my head around the library! Btw, are you not using polymorphic records due to some constraint or it just makes the example easier? IIUC both the records, ie input and validators, can converted to SOP forms and the technique you have presented can be applied from there, right? – Saurabh Nanda Aug 05 '17 at 16:22
  • Also just checking, does generics-sop have some form of zipWith function that allows one to zip two records of the same shape? – Saurabh Nanda Aug 05 '17 at 16:23
  • @Saurabh Nanda there exist `hzipWith` and `hczipWith` functions http://hackage.haskell.org/package/generics-sop-0.3.1.0/docs/Generics-SOP.html#v:hzipWith – danidiaz Aug 05 '17 at 18:41
  • @Saurabh Nanda See this github issue https://github.com/well-typed/generics-sop/issues/47 for a collection of links to learning resources. I prefer to avoid polymorphic records if I can help it, because I feel they are a bit intrusive. And yes, the idea is to construct validators, parsers, serializers... based on the structure of "plain" records. At the cost of boilerplate, sometimes I define a record pattern synonym to help me work with the n-ary products directly, like pattern `User' {name', age', email'} = name' :* age' :* email' :* Nil))`. – danidiaz Aug 05 '17 at 18:43
  • 1
    @danidiaz Hmm, perhaps I should add a `Monoid` instance for `NP`, then `fold_NP` could just be `Data.Foldable.fold`. – kosmikus Aug 06 '17 at 15:45