Suppose I have the following Haskell code:

data Option
    = Help
    | Opt1 Int Double String
    -- more options would be here in a real case

handleOption :: Option -> IO ()
handleOption option = case option of
    Help -> handleHelp
    Opt1 n f s -> handleOpt1 n f s

handleHelp :: IO ()
handleHelp = print "help"

handleOpt1 :: Int -> Double -> String -> IO ()
handleOpt1 n f s = print (n, f, s)

In the above code, it seems wasteful to me to deconstruct the value ahead of time, when I could keep the data bundled neatly together. As it stands, I have to pass each field of Opt1 individually or create a separate data type to carry them along. Is it possible to pass the entire Opt1 value to handleOpt1 while rejecting a general Option value, so that handleOpt1 Help is a compile error?

Example pseudo code below:


data Option
    = Help
    | Opt1 Int Double String

handleOption :: Option -> IO ()
handleOption option = case option of
    Help -> handleHelp
    opt1 @ Opt1{} -> handleOpt1 opt1

handleHelp :: IO ()
handleHelp = print "help"

handleOpt1 :: Option:Opt1 -> IO ()
handleOpt1 (Opt1 n f s) = print (n, f, s)
Brian Tompsett - 汤莱恩
Thomas Eding
  • The compiler can only check that the type is correct. It cannot check that a value passed to `handleOpt1` was constructed with the right constructor (in general; it could in principle check that values that are explicitly passed in the source code are, but since usually passed values are computed during run time, nobody wrote the code to check the exceptional case). – Daniel Fischer Apr 19 '13 at 21:47
  • You _could_ create a typeclass for the options, figure out some way to combine them (and extract useful information from that combination) using existential quantification... Or you could just stop worrying about trivial things. If there's something in your program killing your performance, this probably isn't it. – Cubic Apr 19 '13 at 22:03
  • By "waste", I meant on a design level, not on a performance level. Updated my post. – Thomas Eding Apr 19 '13 at 22:30

4 Answers

You can use GADTs for this.

{-# LANGUAGE GADTs #-}

data Option a where
    Help :: Option ()
    Opt1 :: Int -> Double -> String -> Option (Int, Double, String)

handleOption :: Option a -> IO ()
handleOption option = case option of
    Help          -> handleHelp
    opt1@Opt1{}   -> handleOpt1 opt1

handleHelp :: IO ()
handleHelp = print "help"

handleOpt1 :: Option (Int, Double, String) -> IO ()
handleOpt1 (Opt1 n f s) = print (n, f, s)

With GADTs, you give more type information to the compiler. For handleOpt1, since it only accepts Option (Int, Double, String), the compiler knows Option () (i.e. Help) will never be passed in.

That said, using GADTs makes quite a few other things harder. For instance, automatic deriving (e.g. deriving (Eq, Show)) generally doesn't work with them. You should carefully consider the pros and cons of using them in your case.
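
For the deriving limitation specifically, one possible workaround is a standalone deriving declaration. The sketch below assumes the StandaloneDeriving extension is acceptable in your code base; describeOpt1 is a hypothetical helper added only for illustration and is not part of the code above.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE StandaloneDeriving #-}

-- Same GADT as above: the type index records which constructor was used.
data Option a where
    Help :: Option ()
    Opt1 :: Int -> Double -> String -> Option (Int, Double, String)

-- A `deriving (Show)` clause is rejected for this declaration, but a
-- standalone deriving declaration is accepted.
deriving instance Show (Option a)

-- Hypothetical helper: it only accepts values built with Opt1.
describeOpt1 :: Option (Int, Double, String) -> String
describeOpt1 (Opt1 n f s) = show (n, f, s)

main :: IO ()
main = do
    print Help
    print (Opt1 2 3.0 "x")
    putStrLn (describeOpt1 (Opt1 2 3.0 "x"))
    -- describeOpt1 Help would not compile: Help has type Option (),
    -- which does not match Option (Int, Double, String).
```

Whether this is worth the extra extension depends on how many instances you need; for one or two, writing them by hand may be simpler.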

scvalex

There is a good chance that GHC inlines handleHelp and handleOpt1, thus avoiding the call overhead. To find out for sure, look at the generated Core (the compiler's intermediate representation).

If, for some reason, these functions aren't being inlined, you can mark them with the INLINE pragma:

handleHelp :: IO ()
handleHelp = print "help"
{-# INLINE handleHelp #-}

handleOpt1 :: Option -> IO ()
handleOpt1 (Opt1 n f s) = print (n, f, s)
{-# INLINE handleOpt1 #-}

You can also rely on the inliner to avoid deconstructing the argument in handleOption:

handleOpt1 :: Option -> IO ()
handleOpt1 (Opt1 n f s) = print (n, f, s)
handleOpt1 _ = undefined

The undefined is just to silence the warning about the non-exhaustive pattern match. Alternatively, you can remove this line and enable -fno-warn-incomplete-patterns for this module.
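
If you go the flag route, a per-module pragma keeps the setting local to one file. This is only a sketch: it uses -Wno-incomplete-patterns, which is the spelling of the same flag in GHC 8.0 and later, and describeOpt1 is a hypothetical pure helper introduced here so the partial match can be exercised without IO.

```haskell
{-# OPTIONS_GHC -Wno-incomplete-patterns #-}

data Option
    = Help
    | Opt1 Int Double String

-- Hypothetical helper, deliberately partial: there is no equation for
-- Help, and the pragma above silences the warning for this module only.
describeOpt1 :: Option -> String
describeOpt1 (Opt1 n f s) = show (n, f, s)

handleOpt1 :: Option -> IO ()
handleOpt1 = putStrLn . describeOpt1

main :: IO ()
main = handleOpt1 (Opt1 2 3.0 "")
```

Note that handleOpt1 Help still compiles with this version and only fails at run time with a pattern match failure, so the flag hides the warning without removing the risk.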

Looking at the generated Core we can see that the undefined branch of handleOpt1 was eliminated:

handleOpt2
  :: Option
     -> State# RealWorld
     -> (# State# RealWorld, () #)
handleOpt2 =
  \ (ds_dl7 :: Option)
    (eta_Xh :: State# RealWorld) ->
    case ds_dl7 of _ {
      Help -> ...  
      Opt1 n_aaq f_aar s_aas -> ...

main1
  :: State# RealWorld
     -> (# State# RealWorld, () #)
main1 =
  \ (eta_Xk :: State# RealWorld) ->
    handleOpt2 (Opt1 2 3.0 "") eta_Xk

I prefer the original version, though, since it excludes the possibility of pattern match failure in handleOpt1.

Mikhail Glushenkov

In this particular example it seems way more natural to solve the "problem" by ditching handleHelp and handleOpt1 and making them both separate equations of the handleOption function:

handleOption :: Option -> IO ()

handleOption Help = print "help"

handleOption (Opt1 n f s) = print (n, f, s)

This gets you the best of both worlds. You can write a separate equation for each case (so even if each case is large you keep them from melding into a single giant equation), you don't have to write any boilerplate "dispatch" function, and you don't have to name the parts of the Opt1 case until you actually need to use them.

Ben

I like Ben's answer, but alternatively, you could just introduce more types.

data Opt1Params = Opt1Params Int Double String

data Option = Help | Opt1 Opt1Params

handleOption Help = handleHelp
handleOption (Opt1 params) = handleOpt1 params

handleOpt1 (Opt1Params n f s) = ...
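
A fuller sketch of this approach, under the assumption that handleHelp and handleOpt1 behave as they do in the question (the body of handleOpt1 is elided above, so the one here is a guess). describeParams is a hypothetical helper added so the formatting can be checked without IO.

```haskell
-- The parameters of Opt1 get a type of their own, so they can be passed
-- around as one value without exposing the rest of Option.
data Opt1Params = Opt1Params Int Double String

data Option
    = Help
    | Opt1 Opt1Params

-- Hypothetical pure helper: formats the bundled fields as a tuple.
describeParams :: Opt1Params -> String
describeParams (Opt1Params n f s) = show (n, f, s)

handleOption :: Option -> IO ()
handleOption Help          = handleHelp
handleOption (Opt1 params) = handleOpt1 params

handleHelp :: IO ()
handleHelp = print "help"

-- Assumed body: prints the fields, as handleOpt1 does in the question.
handleOpt1 :: Opt1Params -> IO ()
handleOpt1 = putStrLn . describeParams

main :: IO ()
main = mapM_ handleOption [Help, Opt1 (Opt1Params 2 3.0 "x")]
```

With this layout, handleOpt1 Help is a type error, because Help is an Option and handleOpt1 only accepts an Opt1Params.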

Sebastian Redl