In main I can read my config file, and supply it as runReader (somefunc) myEnv just fine. But somefunc doesn't need access to the myEnv the reader supplies, nor do the next couple in the chain. The function that needs something from myEnv is a tiny leaf function.

How do I get access to the environment in a function without tagging all the intervening functions as (Reader Env)? That can't be right because otherwise you'd just pass myEnv around in the first place. And passing unused parameters through multiple levels of functions is just ugly (isn't it?).

There are plenty of examples I can find on the net but they all seem to have only one level between runReader and accessing the environment.


I'm accepting Chris Taylor's answer because it's the most thorough and I can see it being useful to others. Thanks too to Heatsink, who was the only one who attempted to answer my question directly.

For the test app in question I'll probably just ditch the Reader altogether and pass the environment around. It doesn't buy me anything.

I must say I'm still puzzled by the idea that providing static data to function h changes not only its type signature but also those of g which calls it and f which calls g. All this even though the actual types and computations involved are unchanged. It seems like implementation details are leaking all over the code for no real benefit.

Richard Huxton
  • Why not just pass `myEnv` as an explicit argument to that tiny leaf function? – is7s Jun 27 '12 at 12:40
  • edited to clarify wrt @is7s comment – Richard Huxton Jun 27 '12 at 12:57
  • What you're running into here is the fact that the type system in Haskell keeps you honest. If you want to have static data that everything depends on, you're forced to acknowledge it in the types (or rather, the compiler infers the dependence on the environment - all of the code in my post works without type signatures.) You can't sneakily subvert the type system without using 'unsafe' code, like the call to `unsafePerformIO` in @Heatsink's answer. All part of [wearing the hair shirt](http://research.microsoft.com/en-us/um/people/simonpj/papers/haskell-retrospective/haskellretrospective.pdf)! – Chris Taylor Jun 27 '12 at 19:53
  • Chris - the types aren't changing though. There's no difference between f = 7 and f = . The config-file is automatically type-checked on reading, so I can't see why the compiler should treat the two approaches as different. – Richard Huxton Jun 27 '12 at 20:02

5 Answers

You do give everything the return type of Reader Env a, although this isn't as bad as you think. The reason that everything needs this tag is that if f depends on the environment:

import Control.Monad.Reader

type Env = Int

f :: Int -> Reader Int Int
f x = do
  env <- ask
  return (x + env)

and g calls f:

g x = do
  y <- f x
  return (x + y)

then g also depends on the environment - the value bound in the line y <- f x can be different, depending on what environment is passed in, so the appropriate type for g is

g :: Int -> Reader Int Int

This is actually a good thing! The type system is forcing you to explicitly recognise the places where your functions depend on the global environment. You can save yourself some typing pain by defining a shortcut for the phrase Reader Int:

type Global = Reader Int

so that now your type annotations are:

f, g :: Int -> Global Int

which is a little more readable.
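
Putting the pieces above together, here is a minimal self-contained sketch. It assumes the `mtl` package for `Control.Monad.Reader`; the name `result` and the sample values are hypothetical, chosen only for illustration.

```haskell
import Control.Monad.Reader (Reader, ask, runReader)

type Env = Int
type Global = Reader Env

-- Leaf function: the only place that actually reads the environment.
f :: Int -> Global Int
f x = do
  env <- ask
  return (x + env)

-- g never calls 'ask' itself, but it inherits the dependency by calling f.
g :: Int -> Global Int
g x = do
  y <- f x
  return (x + y)

-- The environment is supplied exactly once, at the top.
-- With environment 10: g 1 = 1 + (1 + 10) = 12
result :: Int
result = runReader (g 1) 10
```

Only `f` mentions `ask`; `g` threads the environment implicitly through the `do` block.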


The alternative to this is to explicitly pass the environment around to all of your functions:

f :: Env -> Int -> Int
f env x = x + env

g :: Env -> Int -> Int
g env x = x + f env x

This can work, and in fact syntax-wise it's not any worse than using the Reader monad. The difficulty comes when you want to extend the semantics. Say you also depend on having an updatable state of type Int which counts function applications. Now you have to change your functions to:

type Counter = Int

f :: Env -> Counter -> Int -> (Int, Counter)
f env counter x = (x + env, counter + 1)

g :: Env -> Counter -> Int -> (Int, Counter)
g env counter x = let (y, newcounter) = f env counter x
                  in (x + y, newcounter + 1)

which is decidedly less pleasant. On the other hand, if we are taking the monadic approach, we simply redefine

type Global = ReaderT Env (State Counter)

The old definitions of f and g continue to work without any trouble. To update them to have application-counting semantics, we simply change them to

f :: Int -> Global Int
f x = do
  modify (+1)
  env <- ask
  return (x + env)

g :: Int -> Global Int
g x = do
  modify (+1)
  y <- f x
  return (x + y)

and they now work perfectly. Compare the two methods:

  • Explicitly passing the environment and state required a complete rewrite when we wanted to add new functionality to our program.

  • Using a monadic interface required a change of three lines - and the program continued to work even after we had changed the first line, meaning that we could do the refactoring incrementally (and test it after each change) which reduces the likelihood that the refactor introduces new bugs.
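
For completeness, a sketch of how the stacked version might be run. It assumes the `mtl` package; the helper `run`, the name `result`, and the sample values are hypothetical and not part of the code above.

```haskell
import Control.Monad.Reader (ReaderT, ask, runReaderT)
import Control.Monad.State (State, modify, runState)

type Env = Int
type Counter = Int
type Global = ReaderT Env (State Counter)

f :: Int -> Global Int
f x = do
  modify (+1)        -- count this application
  env <- ask
  return (x + env)

g :: Int -> Global Int
g x = do
  modify (+1)        -- count this application
  y <- f x
  return (x + y)

-- Hypothetical helper: peel off the Reader layer, then the State layer.
run :: Global a -> Env -> Counter -> (a, Counter)
run action env = runState (runReaderT action env)

-- Environment 10, counter starting at 0:
-- the value is 1 + (1 + 10) = 12, and two applications were counted.
result :: (Int, Counter)
result = run (g 1) 10 0
```

The bodies of `f` and `g` never mention how the counter is threaded; that lives entirely in the definition of `Global` and in `run`.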

Chris Taylor

Nope. You really do tag all the intervening functions as Reader Env, or at least as running in some monad with an Env environment, and the environment really does get passed around everywhere. That's perfectly normal, and it's not as inefficient as you might think: the compiler will often optimize the passing away.

Basically, anything that uses the Reader monad -- even if it's very far down -- should be a Reader itself. (If something doesn't use the Reader monad, and doesn't call anything else that does, it doesn't have to be a Reader.)

That said, using the Reader monad means that you don't have to pass the environment around explicitly -- it's handled automatically by the monad.

(Remember, it's just a pointer to the environment getting passed around, not the environment itself, so it's quite cheap.)
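
A small sketch of that rule, assuming the `mtl` package. The `Config` record and the functions `shout`, `greet` and `welcome` are hypothetical names invented for this illustration. It also uses `asks` to pull one field out of the environment.

```haskell
import Control.Monad.Reader (Reader, asks, runReader)

-- Hypothetical configuration record.
data Config = Config
  { greeting :: String
  , verbose  :: Bool
  }

-- Pure helper: it neither reads the environment nor calls anything
-- that does, so its type does not mention Reader.
shout :: String -> String
shout s = s ++ "!"

-- Leaf function: reads a single field of the environment.
greet :: String -> Reader Config String
greet name = do
  g <- asks greeting
  return (shout (g ++ ", " ++ name))

-- Intermediate function: calls greet, so it must be a Reader as well.
welcome :: String -> Reader Config String
welcome name = do
  line <- greet name
  return ("[" ++ line ++ "]")

result :: String
result = runReader (welcome "world") (Config "Hello" False)
-- "[Hello, world!]"
```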

Louis Wasserman
  • OK, so it's not that I've mis-understood something. But in that case what do I gain over passing around the actual environment itself? I mean, it's all immutable, lazy etc. etc. so not actually copying anything. It looks like it's just adding the word "Reader" to the function header and complicating my code. – Richard Huxton Jun 27 '12 at 13:43
  • The `Reader` monad by itself...doesn't get you all that much, in all honesty. But, for example, you can add in more monads later if you use `ReaderT`, or you can give yourself special syntax for getting specific parts of the environment. But really, the point is that at the end of the day, explicitly passing around the `env` complicates your code more than using the `Reader` monad will. – Louis Wasserman Jun 27 '12 at 13:55

These are truly global variables, since they are initialized exactly once in main. For this situation, it's appropriate to use global variables. You have to use unsafePerformIO to define them if their initialization requires IO.

If you're only reading a configuration file, it's pretty easy:

config :: Config
{-# NOINLINE config #-}
config = unsafePerformIO readConfigurationFile

If there are dependencies on other code, so that you have to control when the configuration file is loaded, it's more complicated:

globalConfig :: MVar Config
{-# NOINLINE globalConfig #-}
globalConfig = unsafePerformIO newEmptyMVar

-- Call this from 'main'
initializeGlobalConfig :: Config -> IO ()
initializeGlobalConfig x = putMVar globalConfig x

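config :: Config
{-# NOINLINE config #-}
config = unsafePerformIO $ do
  -- isEmptyMVar returns IO Bool, so bind its result before testing it
  empty <- isEmptyMVar globalConfig
  when empty $ fail "Configuration has not been loaded"
  readMVar globalConfig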

See also:

  1. Proper way to treat global flags in Haskell
  2. Global variables via unsafePerformIO in Haskell
Heatsink
  • I would never suggest using `unsafe*` ever except when the problem explicitly demands it (e.g. "How do I break referential transparency to avoid having to use `Reader` all the time"). In this situation, `unsafePerformIO` is not warranted imho. – dflemstr Jun 27 '12 at 14:10
  • The first code example is no more unsafe than using `readFile`, since its only side effect is to interleave IO-based file input with pure code. This is IO code that has a pure interface; isn't that what `unsafePerformIO` is for? – Heatsink Jun 27 '12 at 14:27
  • It doesn't have a pure interface - the value of `globalConfig` depends on whether you evaluate it before or after you run the initialise function! This is bad. Yes, it's sort of the same problem as with `readFile`, but at least in that case you're in `IO` so you *know* to expect weird things. – Ben Millwood Jun 27 '12 at 14:50
  • @BenMillwood: I'm referring to the first code example, you're referring to the second. The first code example has two lines of code, and neither contains `globalConfig`. – Heatsink Jun 27 '12 at 15:00
  • @Heatsink: ah, yes. Sorry, I skimread a bit. The first is still a bad idea - suppose the configuration file is edited during program operation. Then the value of `config` changes, and in combination with lazy evaluation can change *depending on which compiler optimisations are active*. Again, it's the same problem as lazy IO, but I happen to think the prevalence of lazy IO in Haskell is *also* a bad thing. – Ben Millwood Jun 27 '12 at 15:03
  • (Also, you haven't used any of the normal safeguards to ensure that the IO isn't duplicated - you may find that if the config is written during program operation, you can even get `config` taking different values in different contexts. That's a problem you can't get with lazy IO). – Ben Millwood Jun 27 '12 at 15:07
  • @BenMillwood Changing the configuration file at run time is exactly what I had in mind when comparing with `readFile`. I agree that if lazy IO is bad, then so is this. Since you reminded me, I added `NOINLINE` to prevent the variable or IO action from being duplicated. – Heatsink Jun 27 '12 at 15:19
  • @RobAgar: but this answer uses techniques generally accepted to be dangerous and un-Haskelly, so I think the downvotes are legitimate; furthermore, there *are* comments saying why it's a bad answer, from two of the three downvoters. This is worse than lazy IO because you don't have the `IO` marker to advise you that bad things are happening. – Ben Millwood Jun 27 '12 at 17:06
  • Well, responding to the answer itself, `Reader` is not global variables; it's nested scopes, because we have [`local :: MonadReader r m => (r -> r) -> m a -> m a`](http://hackage.haskell.org/packages/archive/mtl/latest/doc/html/Control-Monad-Reader-Class.html#v:local), which executes a computation in a modified environment. – Luis Casillas Jun 27 '12 at 17:56
  • Thank you @Heatsink - this is the only answer that actually directly addresses the global-variable nature of a config file. I get the impression this answer is perhaps being overly downvoted - any function called unsafeX clearly shouldn't be used before you read its documentation. – Richard Huxton Jun 27 '12 at 18:47
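
The nested-scope behaviour of `local` mentioned in the comments above can be seen in a short sketch. It assumes the `mtl` package; the names `readEnv`, `scoped` and `result` are hypothetical.

```haskell
import Control.Monad.Reader (Reader, ask, local, runReader)

type Env = Int

readEnv :: Reader Env Int
readEnv = ask

-- 'local' runs one sub-computation in a modified environment;
-- the change does not leak out to the code that follows it.
scoped :: Reader Env (Int, Int, Int)
scoped = do
  before <- readEnv
  inner  <- local (* 2) readEnv
  after  <- readEnv
  return (before, inner, after)

result :: (Int, Int, Int)
result = runReader scoped 10
-- (10, 20, 10)
```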

Another technique which might be useful is to pass the leaf function itself, partially applied with the value from the config file. Of course this only makes sense if being able to replace the leaf function is somehow to your advantage.
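
A rough sketch of this idea, using hypothetical functions `leaf`, `g` and `f` and an `Int` environment purely for illustration:

```haskell
type Env = Int

-- The leaf is the only function that needs the environment.
leaf :: Env -> Int -> Int
leaf env x = x + env

-- The intermediate functions receive the already-configured leaf as an
-- ordinary function argument and know nothing about Env.
g :: (Int -> Int) -> Int -> Int
g h x = x + h x

f :: (Int -> Int) -> Int -> Int
f h x = 2 * g h x

-- Partially apply the leaf to the config value once, at the top.
-- f (leaf 10) 1 = 2 * (1 + (1 + 10)) = 24
result :: Int
result = f (leaf 10) 1
```

The intermediate functions still carry an extra parameter, but it is now a function they can be tested against with any stand-in, not the environment itself.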

Rob Agar

If you don't want to make everything down to the tiny leaf function be in the Reader monad, does your data allow you to extract the necessary item(s) out of the Reader monad at the top level, and then pass them as ordinary parameters down through to the leaf function? That would eliminate the need for everything in between to be in Reader, although if the leaf function does need to know that it's inside Reader in order to use Reader's facilities then you can't get away from having to run it inside your Reader instance.
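
One way this might look, assuming the `mtl` package. The `Config` record and the functions `leaf`, `middle` and `top` are hypothetical.

```haskell
import Control.Monad.Reader (Reader, asks, runReader)

-- Hypothetical configuration record.
data Config = Config
  { offset :: Int
  , label  :: String
  }

-- Plain functions: they take just the value they need as a parameter.
leaf :: Int -> Int -> Int
leaf off x = x + off

middle :: Int -> Int -> Int
middle off x = leaf off x * 2

-- Only the top level lives in Reader; it extracts the fields and
-- passes them down as ordinary arguments.
top :: Int -> Reader Config String
top x = do
  off <- asks offset
  lbl <- asks label
  return (lbl ++ ": " ++ show (middle off x))

result :: String
result = runReader (top 1) (Config 10 "answer")
-- "answer: 22"
```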

Matthew Walton